The roots of resilience... are to be found in the sense of being understood by and existing in the mind and heart of a loving, attuned, and self-possessed other.
—Diana Fosha

The Children’s Clinic at the Massachusetts Mental Health Center was filled with disturbed and disturbing kids. They were wild creatures who could not sit still and who hit and bit other children, and sometimes even the staff. They would run up to you and cling to you one moment and run away, terrified, the next. Some masturbated compulsively; others lashed out at objects, pets, and themselves. They were at once starving for affection and angry and defiant. The girls in particular could be painfully compliant. Whether oppositional or clingy, none of them seemed able to explore or play in ways typical for children their age. Some of them had hardly developed a sense of self—they couldn’t even recognize themselves in a mirror. At the time, I knew very little about children, apart from what my two preschoolers were teaching me. But I was fortunate in my colleague Nina Fish-Murray, who had studied with Jean Piaget in Geneva, in addition to raising five children of her own. Piaget based his theories of child development on meticulous, direct observation of children themselves, starting with his own infants, and Nina brought this spirit to the incipient Trauma Center at MMHC. Nina was married to the former chairman of the Harvard psychology department, Henry Murray, one of the pioneers of personality theory, and she actively encouraged any junior faculty members who shared her interests. She was fascinated by my stories about combat veterans because they reminded her of the troubled kids she worked with in the Boston public schools. Nina’s privileged position and personal charm gave us access to the Children’s Clinic, which was run by child psychiatrists who had little interest in trauma. Henry Murray had, among other things, become famous for designing the widely used Thematic Apperception Test. 
The TAT is a so-called projective test, which uses a set of cards to discover how people’s inner reality shapes their view of the world. Unlike the Rorschach cards we used with the veterans, the TAT cards depict realistic but ambiguous and somewhat troubling scenes: a man and a woman gloomily staring away from each other, a boy looking at a broken violin. Subjects are asked to tell stories about what is going on in the photo, what has happened previously, and what happens next. In most cases their interpretations quickly reveal the themes that preoccupy them. Nina and I decided to create a set of test cards specifically for children, based on pictures we cut out of magazines in the clinic waiting room. Our first study compared twelve six- to eleven-year-olds at the children’s clinic with a group of children from a nearby school who matched them as closely as possible in age, race, intelligence, and family constellation.1 What differentiated our patients was the abuse they had suffered within their families. They included a boy who was severely bruised from repeated beatings by his mother; a girl whose father had molested her at the age of four; two boys who had been repeatedly tied to a chair and whipped; and a girl who, at the age of five, had seen her mother (a prostitute) raped, dismembered, burned, and put into the trunk of a car. The mother’s pimp was suspected of sexually abusing the girl. The children in our control group also lived in poverty in a depressed area of Boston where they regularly witnessed shocking violence. While the study was being conducted, one boy at their school threw gasoline at a classmate and set him on fire. Another boy was caught in crossfire while walking to school with his father and a friend. He was wounded in the groin, and his friend was killed. Given their exposure to such a high baseline level of violence, would their responses to the cards differ from those of the hospitalized children? 
One of our cards depicted a family scene: two smiling kids watching dad repair a car. Every child who looked at it commented on the danger to the man lying underneath the vehicle. While the control children told stories with benign endings—the car would get fixed, and maybe dad and the kids would drive to McDonald’s—the traumatized kids came up with gruesome tales. One girl said that the little girl in the picture was about to smash in her father’s skull with a hammer. A nine-year-old boy who had been severely physically abused told an elaborate story about how the boy in the picture kicked away the jack, so that the car mangled his father’s body and his blood spurted all over the garage. As they told us these stories, our patients got very excited and disorganized. We had to take considerable time out at the water cooler and going for walks before we could show them the next card. It was little wonder that almost all of them had been diagnosed with ADHD, and most were on Ritalin—though the drug certainly didn’t seem to dampen their arousal in this situation. The abused kids gave similar responses to a seemingly innocuous picture of a pregnant woman silhouetted against a window. When we showed it to the seven-year-old girl who’d been sexually abused at age four, she talked about penises and vaginas and repeatedly asked Nina questions like “How many people have you humped?” Like several of the other sexually abused girls in the study, she became so agitated that we had to stop. A seven-year-old girl from the control group picked up the wistful mood of the picture: Her story was about a widowed lady sadly looking out the window, missing her husband. But in the end, the lady found a loving man to be a good father to her baby. In card after card we saw that, despite their alertness to trouble, the children who had not been abused still trusted in an essentially benign universe; they could imagine ways out of bad situations. 
They seemed to feel protected and safe within their own families. They also felt loved by at least one of their parents, which seemed to make a substantial difference in their eagerness to engage in schoolwork and to learn. The responses of the clinic children were alarming. The most innocent images stirred up intense feelings of danger, aggression, sexual arousal, and terror. We had not selected these photos because they had some hidden meaning that sensitive people could uncover; they were ordinary images of everyday life. We could only conclude that for abused children, the whole world is filled with triggers. As long as they can imagine only disastrous outcomes to relatively benign situations, anybody walking into a room, any stranger, any image on a screen or on a billboard might be perceived as a harbinger of catastrophe. In this light the bizarre behavior of the kids at the children’s clinic made perfect sense.2 To my amazement, staff discussions on the unit rarely mentioned the horrific real-life experiences of the children and the impact of those traumas on their feelings, thinking, and self-regulation. Instead, their medical records were filled with diagnostic labels: “conduct disorder” or “oppositional defiant disorder” for the angry and rebellious kids; or “bipolar disorder.” ADHD was a “comorbid” diagnosis for almost all. Was the underlying trauma being obscured by this blizzard of diagnoses? Now we faced two big challenges. One was to learn whether the different worldview of normal children could account for their resilience and, on a deeper level, how each child actually creates her map of the world. The other, equally crucial, question was: Is it possible to help the minds and brains of brutalized children to redraw their inner maps and incorporate a sense of trust and confidence in the future?

MEN WITHOUT MOTHERS
The scientific study of the vital relationship between infants and their mothers was started by upper-class Englishmen who were torn from their families as young boys to be sent off to boarding schools, where they were raised in regimented same-sex settings. The first time I visited the famed Tavistock Clinic in London I noticed a collection of black-and-white photographs of these great twentieth-century psychiatrists hanging on the wall going up the main staircase: John Bowlby, Wilfred Bion, Harry Guntrip, Ronald Fairbairn, and Donald Winnicott. Each of them, in his own way, had explored how our early experiences become prototypes for all our later connections with others, and how our most intimate sense of self is created in our minute-to-minute exchanges with our caregivers. Scientists study what puzzles them most, so that they often become experts in subjects that others take for granted. (Or, as the attachment researcher Beatrice Beebe once told me, “most research is me-search.”) These men who studied the role of mothers in children’s lives had themselves been sent off to school at a vulnerable age, sometime between six and ten, long before they should have faced the world alone. Bowlby himself told me that just such boarding-school experiences probably inspired George Orwell’s novel 1984, which brilliantly expresses how human beings may be induced to sacrifice everything they hold dear and true—including their sense of self—for the sake of being loved and approved of by someone in a position of authority. Since Bowlby was close friends with the Murrays, I had a chance to talk with him about his work whenever he visited Harvard. He was born into an aristocratic family (his father was surgeon to the King’s household), and he trained in psychology, medicine, and psychoanalysis at the temples of the British establishment. 
After attending Cambridge University, he worked with delinquent boys in London’s East End, a notoriously rough and crime-ridden neighborhood that was largely destroyed during the Blitz. During and after his service in World War II, he observed the effects of wartime evacuations and group nurseries that separated young children from their families. He also studied the effect of hospitalization, showing that even brief separations (parents back then were not allowed to visit overnight) compounded the children’s suffering. By the late 1940s Bowlby had become persona non grata in the British psychoanalytic community, as a result of his radical claim that children’s disturbed behavior was a response to actual life experiences—to neglect, brutality, and separation—rather than the product of infantile sexual fantasies. Undaunted, he devoted the rest of his life to developing what came to be called attachment theory.3

A SECURE BASE
As we enter this world we scream to announce our presence. Someone immediately engages with us, bathes us, swaddles us, and fills our stomachs, and, best of all, our mother may put us on her belly or breast for delicious skin-to-skin contact. We are profoundly social creatures; our lives consist of finding our place within the community of human beings. I love the expression of the great French psychiatrist Pierre Janet: “Every life is a piece of art, put together with all means available.” As we grow up, we gradually learn to take care of ourselves, both physically and emotionally, but we get our first lessons in self-care from the way that we are cared for. Mastering the skill of self-regulation depends to a large degree on how harmonious our early interactions with our caregivers are. Children whose parents are reliable sources of comfort and strength have a lifetime advantage—a kind of buffer against the worst that fate can hand them. John Bowlby realized that children are captivated by faces and voices and are exquisitely sensitive to facial expression, posture, tone of voice, physiological changes, tempo of movement and incipient action. He saw this inborn capacity as a product of evolution, essential to the survival of these helpless creatures. Children are also programmed to choose one particular adult (or at most a few) with whom their natural communication system develops. This creates a primary attachment bond. The more responsive the adult is to the child, the deeper the attachment and the more likely the child will develop healthy ways of responding to the people around him. Bowlby would often visit Regent’s Park in London, where he would make systematic observations of the interactions between children and their mothers. While the mothers sat quietly on park benches, knitting or reading the paper, the kids would wander off to explore, occasionally looking over their shoulders to ascertain that Mum was still watching. 
But when a neighbor stopped by and absorbed their mother’s interest with the latest gossip, the kids would run back and stay close, making sure they still had her attention. When infants and young children notice that their mothers are not fully engaged with them, they become nervous. When their mothers disappear from sight, they may cry and become inconsolable, but as soon as their mothers return, they quiet down and resume their play. Bowlby saw attachment as the secure base from which a child moves out into the world. Over the subsequent five decades research has firmly established that having a safe haven promotes self-reliance and instills a sense of sympathy and helpfulness to others in distress. From the intimate give-and-take of the attachment bond children learn that other people have feelings and thoughts that are both similar to and different from theirs. In other words, they get “in sync” with their environment and with the people around them and develop the self-awareness, empathy, impulse control, and self-motivation that make it possible to become contributing members of the larger social culture. These qualities were painfully missing in the kids at our Children’s Clinic.

THE DANCE OF ATTUNEMENT
Children become attached to whoever functions as their primary caregiver. But the nature of that attachment—whether it is secure or insecure—makes a huge difference over the course of a child’s life. Secure attachment develops when caregiving includes emotional attunement. Attunement starts at the most subtle physical levels of interaction between babies and their caretakers, and it gives babies the feeling of being met and understood. As Edinburgh-based attachment researcher Colwyn Trevarthen says: “The brain coordinates rhythmic body movements and guides them to act in sympathy with other people’s brains. Infants hear and learn musicality from their mother’s talk, even before birth.”4 In chapter 4 I described the discovery of mirror neurons, the brain-to-brain links that give us our capacity for empathy. Mirror neurons start functioning as soon as babies are born. When researcher Andrew Meltzoff at the University of Oregon pursed his lips or stuck out his tongue at six-hour-old babies, they promptly mirrored his actions.5 (Newborns can focus their eyes only on objects within eight to twelve inches—just enough to see the person who is holding them.) Imitation is our most fundamental social skill. It assures that we automatically pick up and reflect the behavior of our parents, teachers, and peers. Most parents relate to their babies so spontaneously that they are barely aware of how attunement unfolds. But an invitation from a friend, the attachment researcher Ed Tronick, gave me the chance to observe that process more closely. Through a one-way mirror at Harvard’s Laboratory of Human Development, I watched a mother playing with her two-month-old son, who was propped in an infant seat facing her. They were cooing to each other and having a wonderful time—until the mother leaned in to nuzzle him and the baby, in his excitement, yanked on her hair. The mother was caught unawares and yelped with pain, pushing away his hand while her face contorted with anger. 
The baby let go immediately, and they pulled back physically from each other. For both of them the source of delight had become a source of distress. Obviously frightened, the baby brought his hands up to his face to block out the sight of his angry mother. The mother, in turn, realizing that her baby was upset, refocused on him, making soothing sounds in an attempt to smooth things over. The infant still had his eyes covered, but his craving for connection soon reemerged. He started peeking out to see if the coast was clear, while his mother reached toward him with a concerned expression. As she started to tickle his belly, he dropped his arms and broke into a happy giggle, and harmony was reestablished. Infant and mother were attuned again. This entire sequence of delight, rupture, repair, and new delight took slightly less than twelve seconds. Tronick and other researchers have now shown that when infants and caregivers are in sync on an emotional level, they’re also in sync physically.6 Babies can’t regulate their own emotional states, much less the changes in heart rate, hormone levels, and nervous-system activity that accompany emotions. When a child is in sync with his caregiver, his sense of joy and connection is reflected in his steady heartbeat and breathing and a low level of stress hormones. His body is calm; so are his emotions. The moment this music is disrupted—as it often is in the course of a normal day—all these physiological factors change as well. You can tell equilibrium has been restored when the physiology calms down. We soothe newborns, but parents soon start teaching their children to tolerate higher levels of arousal, a job that is often assigned to fathers. (I once heard the psychologist John Gottman say, “Mothers stroke, and fathers poke.”) Learning how to manage arousal is a key life skill, and parents must do it for babies before babies can do it for themselves. 
If that gnawing sensation in his belly makes a baby cry, the breast or bottle arrives. If he’s scared, someone holds and rocks him until he calms down. If his bowels erupt, someone comes to make him clean and dry. Associating intense sensations with safety, comfort, and mastery is the foundation of self-regulation, self-soothing, and self-nurture, a theme to which I return throughout this book. A secure attachment combined with the cultivation of competency builds an internal locus of control, the key factor in healthy coping throughout life.7 Securely attached children learn what makes them feel good; they discover what makes them (and others) feel bad, and they acquire a sense of agency: that their actions can change how they feel and how others respond. Securely attached kids learn the difference between situations they can control and situations where they need help. They learn that they can play an active role when faced with difficult situations. In contrast, children with histories of abuse and neglect learn that their terror, pleading, and crying do not register with their caregiver. Nothing they can do or say stops the beating or brings attention and help. In effect they’re being conditioned to give up when they face challenges later in life.

BECOMING REAL
Bowlby’s contemporary, the pediatrician and psychoanalyst Donald Winnicott, is the father of modern studies of attunement. His minute observations of mothers and children started with the way mothers hold their babies. He proposed that these physical interactions lay the groundwork for a baby’s sense of self—and, with that, a lifelong sense of identity. The way a mother holds her child underlies “the ability to feel the body as the place where the psyche lives.”8 This visceral and kinesthetic sensation of how our bodies are met lays the foundation for what we experience as “real.” Winnicott thought that the vast majority of mothers did just fine in their attunement to their infants—it does not require extraordinary talent to be what he called a “good enough mother.”10 But things can go seriously wrong when mothers are unable to tune in to their baby’s physical reality. If a mother cannot meet her baby’s impulses and needs, “the baby learns to become the mother’s idea of what the baby is.” Having to discount its inner sensations, and trying to adjust to its caregiver’s needs, means the child perceives that “something is wrong” with the way it is. Children who lack physical attunement are vulnerable to shutting down the direct feedback from their bodies, the seat of pleasure, purpose, and direction. In the years since Bowlby’s and Winnicott’s ideas were introduced, attachment research around the world has shown that the vast majority of children are securely attached. When they grow up, their history of reliable, responsive caregiving will help to keep fear and anxiety at bay. Barring exposure to some overwhelming life event—trauma—that breaks down the self-regulatory system, they will maintain a fundamental state of emotional security throughout their lives. Secure attachment also forms a template for children’s relationships. 
They pick up what others are feeling and early on learn to tell a game from reality, and they develop a good nose for phony situations or dangerous people. Securely attached children usually become pleasant playmates and have lots of self-affirming experiences with their peers. Having learned to be in tune with other people, they tend to notice subtle changes in voices and faces and to adjust their behavior accordingly. They learn to live within a shared understanding of the world and are likely to become valued members of the community. This upward spiral can, however, be reversed by abuse or neglect. Abused kids are often very sensitive to changes in voices and faces, but they tend to respond to them as threats rather than as cues for staying in sync. Dr. Seth Pollak of the University of Wisconsin showed a series of faces to a group of normal eight-year-olds and compared their responses with those of a group of abused children the same age. Looking at this spectrum of angry to sad expressions, the abused kids were hyperalert to the slightest features of anger. This is one reason abused children so easily become defensive or scared. Imagine what it’s like to make your way through a sea of faces in the school corridor, trying to figure out who might assault you. Children who overreact to their peers’ aggression, who don’t pick up on other kids’ needs, who easily shut down or lose control of their impulses, are likely to be shunned and left out of sleepovers or play dates. Eventually they may learn to cover up their fear by putting up a tough front. Or they may spend more and more time alone, watching TV or playing computer games, falling even further behind on interpersonal skills and emotional self-regulation. The need for attachment never lessens. Most human beings simply cannot tolerate being disengaged from others for any length of time. 
People who cannot connect through work, friendships, or family usually find other ways of bonding, as through illnesses, lawsuits, or family feuds. Anything is preferable to that godforsaken sense of irrelevance and alienation. A few years ago, on Christmas Eve, I was called to examine a fourteen-year-old boy at the Suffolk County Jail. Jack had been arrested for breaking into the house of neighbors who were away on vacation. The burglar alarm was howling when the police found him in the living room. The first question I asked Jack was who he expected would visit him in jail on Christmas. “Nobody,” he told me. “Nobody ever pays attention to me.” It turned out that he had been caught during break-ins numerous times before. He knew the police, and they knew him. With delight in his voice, he told me that when the cops saw him standing in the middle of the living room, they yelled, “Oh my God, it’s Jack again, that little motherfucker.” Somebody recognized him; somebody knew his name. A little while later Jack confessed, “You know, that is what makes it worthwhile.” Kids will go to almost any length to feel seen and connected.

LIVING WITH THE PARENTS YOU HAVE
Children have a biological instinct to attach—they have no choice. Whether their parents or caregivers are loving and caring or distant, insensitive, rejecting, or abusive, children will develop a coping style based on their attempt to get at least some of their needs met. We now have reliable ways to assess and identify these coping styles, thanks largely to the work of two American scientists, Mary Ainsworth and Mary Main, and their colleagues, who conducted thousands of hours of observation of mother-infant pairs over many years. Based on these studies, Ainsworth created a research tool called the Strange Situation, which looks at how an infant reacts to temporary separation from the mother. Just as Bowlby had observed, securely attached infants are distressed when their mother leaves them, but they show delight when she returns, and after a brief check-in for reassurance, they settle down and resume their play. But with infants who are insecurely attached, the picture is more complex. Children whose primary caregiver is unresponsive or rejecting learn to deal with their anxiety in two distinct ways. The researchers noticed that some seemed chronically upset and demanding with their mothers, while others were more passive and withdrawn. In both groups contact with the mothers failed to settle them down—they did not return to play contentedly, as happens in secure attachment. In one pattern, called “avoidant attachment,” the infants look like nothing really bothers them—they don’t cry when their mother goes away and they ignore her when she comes back. However, this does not mean that they are unaffected. In fact, their chronically increased heart rates show that they are in a constant state of hyperarousal. My colleagues and I call this pattern “dealing but not feeling.”12 Most mothers of avoidant infants seem to dislike touching their children. 
They have trouble snuggling and holding them, and they don’t use their facial expressions and voices to create pleasurable back-and-forth rhythms with their babies. In another pattern, called “anxious” or “ambivalent” attachment, the infants constantly draw attention to themselves by crying, yelling, clinging, or screaming: They are “feeling but not dealing.”13 They seem to have concluded that unless they make a spectacle, nobody is going to pay attention to them. They become enormously upset when they do not know where their mother is but derive little comfort from her return. And even though they don’t seem to enjoy her company, they stay passively or angrily focused on her, even in situations when other children would rather play. Attachment researchers think that the three “organized” attachment strategies (secure, avoidant, and anxious) work because they elicit the best care a particular caregiver is capable of providing. Infants who encounter a consistent pattern of care—even if it’s marked by emotional distance or insensitivity—can adapt to maintain the relationship. That does not mean that there are no problems: Attachment patterns often persist into adulthood. Anxious toddlers tend to grow into anxious adults, while avoidant toddlers are likely to become adults who are out of touch with their own feelings and those of others. (As in, “There’s nothing wrong with a good spanking. I got hit and it made me the success I am today.”) In school avoidant children are likely to bully other kids, while the anxious children are often their victims. However, development is not linear, and many life experiences can intervene to change these outcomes. But there is another group that is less stably adapted, a group that makes up the bulk of the children we treat and a substantial proportion of the adults who are seen in psychiatric clinics. 
Some twenty years ago, Mary Main and her colleagues at Berkeley began to identify a group of children (about 15 percent of those they studied) who seemed to be unable to figure out how to engage with their caregivers. The critical issue turned out to be that the caregivers themselves were a source of distress or terror to the children. Children in this situation have no one to turn to, and they are faced with an unsolvable dilemma; their mothers are simultaneously necessary for survival and a source of fear. They “can neither approach (the secure and ambivalent ‘strategies’), shift [their] attention (the avoidant ‘strategy’), nor flee.” If you observe such children in a nursery school or attachment laboratory, you see them look toward their parents when they enter the room and then quickly turn away. Unable to choose between seeking closeness and avoiding the parent, they may rock on their hands and knees, appear to go into a trance, freeze with their arms raised, or get up to greet their parent and then fall to the ground. Not knowing who is safe or whom they belong to, they may be intensely affectionate with strangers or may trust nobody. Main called this pattern “disorganized attachment.” Disorganized attachment is “fright without solution.”

BECOMING DISORGANIZED WITHIN
Conscientious parents often become alarmed when they discover attachment research, worrying that their occasional impatience or their ordinary lapses in attunement may permanently damage their kids. In real life there are bound to be misunderstandings, inept responses, and failures of communication. Because mothers and fathers miss cues or are simply preoccupied with other matters, infants are frequently left to their own devices to discover how they can calm themselves down. Within limits this is not a problem. Kids need to learn to handle frustrations and disappointments. With “good enough” caregivers, children learn that broken connections can be repaired. The critical issue is whether they can incorporate a feeling of being viscerally safe with their parents or other caregivers. In a study of attachment patterns in over two thousand infants in “normal” middle-class environments, 62 percent were found to be secure, 15 percent avoidant, 9 percent anxious (also known as ambivalent), and 15 percent disorganized.21 Interestingly, this large study showed that the child’s gender and basic temperament have little effect on attachment style; for example, children with “difficult” temperaments are not more likely to develop a disorganized style. Kids from lower socioeconomic groups are more likely to be disorganized, with parents often severely stressed by economic and family instability. Children who don’t feel safe in infancy have trouble regulating their moods and emotional responses as they grow older. By kindergarten, many disorganized infants are either aggressive or spaced out and disengaged, and they go on to develop a range of psychiatric problems.23 They also show more physiological stress, as expressed in heart rate, heart rate variability,24 stress hormone responses, and lowered immune factors. Does this kind of biological dysregulation automatically reset to normal as a child matures or is moved to a safe environment? So far as we know, it does not. 
Parental abuse is not the only cause of disorganized attachment: Parents who are preoccupied with their own trauma, such as domestic abuse or rape or the recent death of a parent or sibling, may also be too emotionally unstable and inconsistent to offer much comfort and protection. While all parents need all the help they can get to help raise secure children, traumatized parents, in particular, need help to be attuned to their children’s needs. Caregivers often don’t realize that they are out of tune. I vividly remember a videotape Beatrice Beebe showed me.28 It featured a young mother playing with her three-month-old infant. Everything was going well until the baby pulled back and turned his head away, signaling that he needed a break. But the mother did not pick up on his cue, and she intensified her efforts to engage him by bringing her face closer to his and increasing the volume of her voice. When he recoiled even more, she kept bouncing and poking him. Finally he started to scream, at which point the mother put him down and walked away, looking crestfallen. She obviously felt terrible, but she had simply missed the relevant cues. It’s easy to imagine how this kind of misattunement, repeated over and over again, can gradually lead to a chronic disconnection. (Anyone who’s raised a colicky or hyperactive baby knows how quickly stress rises when nothing seems to make a difference.) Chronically failing to calm her baby down and establish an enjoyable face-to-face interaction, the mother is likely to come to perceive him as a difficult child who makes her feel like a failure, and give up on trying to comfort her child. In practice it often is difficult to distinguish the problems that result from disorganized attachment from those that result from trauma: They are often intertwined. 
My colleague Rachel Yehuda studied rates of PTSD in adult New Yorkers who had been assaulted or raped.29 Those whose mothers were Holocaust survivors with PTSD had a significantly higher rate of developing serious psychological problems after these traumatic experiences. The most reasonable explanation is that their upbringing had left them with a vulnerable physiology, making it difficult for them to regain their equilibrium after being violated. Yehuda found a similar vulnerability in the children of pregnant women who were in the World Trade Center that fatal day in 2001. Similarly, the reactions of children to painful events are largely determined by how calm or stressed their parents are. My former student Glenn Saxe, now chairman of the Department of Child and Adolescent Psychiatry at NYU, showed that when children were hospitalized for treatment of severe burns, the development of PTSD could be predicted by how safe they felt with their mothers.31 The security of their attachment to their mothers predicted the amount of morphine that was required to control their pain—the more secure the attachment, the less painkiller was needed. Another colleague, Claude Chemtob, who directs the Family Trauma Research Program at NYU Langone Medical Center, studied 112 New York City children who had directly witnessed the terrorist attacks on 9/11. Children whose mothers were diagnosed with PTSD or depression during follow-up were six times more likely to have significant emotional problems and eleven times more likely to be hyperaggressive in response to their experience. Children whose fathers had PTSD showed behavioral problems as well, but Chemtob discovered that this effect was indirect and was transmitted via the mother. (Living with an irascible, withdrawn, or terrified spouse is likely to impose a major psychological burden on the partner, including depression.) If you have no internal sense of security, it is difficult to distinguish between safety and danger. 
If you feel chronically numbed out, potentially dangerous situations may make you feel alive. If you conclude that you must be a terrible person (because why else would your parents have treated you that way?), you start expecting other people to treat you horribly. You probably deserve it, and anyway, there is nothing you can do about it. When disorganized people carry self-perceptions like these, they are set up to be traumatized by subsequent experiences.
THE LONG-TERM EFFECTS OF DISORGANIZED ATTACHMENT
In the early 1980s my colleague Karlen Lyons-Ruth, a Harvard attachment researcher, began to videotape face-to-face interactions between mothers and their infants at six months, twelve months and eighteen months. She taped them again when the children were five years old and once more when they were seven or eight. All were from high-risk families: 100 percent met federal poverty guidelines, and almost half the mothers were single parents. Disorganized attachment showed up in two different ways: One group of mothers seemed to be too preoccupied with their own issues to attend to their infants. They were often intrusive and hostile; they alternated between rejecting their infants and acting as if they expected them to respond to their needs. Another group of mothers seemed helpless and fearful. They often came across as sweet or fragile, but they didn’t know how to be the adult in the relationship and seemed to want their children to comfort them. They failed to greet their children after having been away and did not pick them up when the children were distressed. The mothers didn’t seem to be doing these things deliberately—they simply didn’t know how to be attuned to their kids and respond to their cues and thus failed to comfort and reassure them. The hostile/intrusive mothers were more likely to have childhood histories of physical abuse and/or of witnessing domestic violence, while the withdrawn/dependent mothers were more likely to have histories of sexual abuse or parental loss (but not physical abuse).35 I have always wondered how parents come to abuse their kids. After all, raising healthy offspring is at the very core of our human sense of purpose and meaning. What could drive parents to deliberately hurt or neglect their children? Karlen’s research provided me with one answer: Watching her videos, I could see the children becoming more and more inconsolable, sullen, or resistant to their misattuned mothers. 
At the same time, the mothers became increasingly frustrated, defeated, and helpless in their interactions. Once the mother comes to see the child not as her partner in an attuned relationship but as a frustrating, enraging, disconnected stranger, the stage is set for subsequent abuse.
About eighteen years later, when these kids were around twenty years old, Lyons-Ruth did a follow-up study to see how they were coping. Infants with seriously disrupted emotional communication patterns with their mothers at eighteen months grew up to become young adults with an unstable sense of self, self-damaging impulsivity (including excessive spending, promiscuous sex, substance abuse, reckless driving, and binge eating), inappropriate and intense anger, and recurrent suicidal behavior. Karlen and her colleagues had expected that hostile/intrusive behavior on the part of the mothers would be the most powerful predictor of mental instability in their adult children, but they discovered otherwise. Emotional withdrawal had the most profound and long-lasting impact. Emotional distance and role reversal (in which mothers expected the kids to look after them) were specifically linked to aggressive behavior against self and others in the young adults.
DISSOCIATION: KNOWING AND NOT KNOWING
Lyons-Ruth was particularly interested in the phenomenon of dissociation, which is manifested in feeling lost, overwhelmed, abandoned, and disconnected from the world and in seeing oneself as unloved, empty, helpless, trapped, and weighed down. She found a “striking and unexpected” relationship between maternal disengagement and misattunement during the first two years of life and dissociative symptoms in early adulthood. Lyons-Ruth concludes that infants who are not truly seen and known by their mothers are at high risk to grow into adolescents who are unable to know and to see.”36 Infants who live in secure relationships learn to communicate not only their frustrations and distress but also their emerging selves—their interests, preferences, and goals. Receiving a sympathetic response cushions infants (and adults) against extreme levels of frightened arousal. But if your caregivers ignore your needs, or resent your very existence, you learn to anticipate rejection and withdrawal. You cope as well as you can by blocking out your mother’s hostility or neglect and act as if it doesn’t matter, but your body is likely to remain in a state of high alert, prepared to ward off blows, deprivation, or abandonment. Dissociation means simultaneously knowing and not knowing.37 Bowlby wrote: “What cannot be communicated to the [m]other cannot be communicated to the self.”38 If you cannot tolerate what you know or feel what you feel, the only option is denial and dissociation.39 Maybe the most devastating long-term effect of this shutdown is not feeling real inside, a condition we saw in the kids in the Children’s Clinic and that we see in the children and adults who come to the Trauma Center. When you don’t feel real nothing matters, which makes it impossible to protect yourself from danger. Or you may resort to extremes in an effort to feel something— even cutting yourself with a razor blade or getting into fistfights with strangers. 
Karlen’s research showed that dissociation is learned early: Later abuse or other traumas did not account for dissociative symptoms in young adults.40 Abuse and trauma accounted for many other problems, but not for chronic dissociation or aggression against self. The critical underlying issue was that these patients didn’t know how to feel safe. Lack of safety within the early caregiving relationship led to an impaired sense of inner reality, excessive clinging, and self-damaging behavior: Poverty, single parenthood, or maternal psychiatric symptoms did not predict these symptoms. This does not imply that child abuse is irrelevant,41 but that the quality of early caregiving is critically important in preventing mental health problems, independent of other traumas. For that reason treatment needs to address not only the imprints of specific traumatic events but also the consequences of not having been mirrored, attuned to, and given consistent care and affection: dissociation and loss of self-regulation.
RESTORING SYNCHRONY
Early attachment patterns create the inner maps that chart our relationships throughout life, not only in terms of what we expect from others, but also in terms of how much comfort and pleasure we can experience in their presence. I doubt that the poet e. e. cummings could have written his joyous lines: “I like my body when it is with your body.... muscles better and nerves more” if his earliest experiences had been frozen faces and hostile glances. Our relationship maps are implicit, etched into the emotional brain and not reversible simply by understanding how they were created. You may realize that your fear of intimacy has something to do with your mother’s postpartum depression or with the fact that she herself was molested as a child, but that alone is unlikely to open you to happy, trusting engagement with others. However, that realization may help you to start exploring other ways to connect in relationships—both for your own sake and in order to not pass on an insecure attachment to your own children. In part 5 I’ll discuss a number of approaches to healing damaged attunement systems through training in rhythmicity and reciprocity. Being in synch with oneself and with others requires the integration of our body-based senses—vision, hearing, touch, and balance. If this did not happen in infancy and early childhood, there is an increased chance of later sensory integration problems (to which trauma and neglect are by no means the only pathways). Being in synch means resonating through sounds and movements that connect, which are embedded in the daily sensory rhythms of cooking and cleaning, going to bed and waking up. Being in synch may mean sharing funny faces and hugs, expressing delight or disapproval at the right moments, tossing balls back and forth, or singing together. 
At the Trauma Center, we have developed programs to coach parents in connection and attunement, and my patients have told me about many other ways to get themselves in synch, ranging from choral singing and ballroom dancing to joining basketball teams, jazz bands and chamber music groups. All of these foster a sense of attunement and communal pleasure.
Friday, June 10, 2022
Ch 7: Getting on the same wavelength: Attachment and Attunement (Bessel van der Kolk - The Body Keeps the Score (2022))
Statistics Books (2022-June)
Download Books
1. Naked Statistics: Stripping the Dread from the Data. Book by Charles Wheelan
2. An Introduction to Statistical Learning: With Applications in R. Originally published: 24 June 2013. Authors: Gareth M. James, Daniela Witten, Trevor Hastie, Robert Tibshirani. Editor: Gareth M. James. Original language: English. Genre: Textbook
3. The Elements of Statistical Learning. Book by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie
4. Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python. Book by Andrew Bruce, Peter C. Bruce, and Peter Gedeck
5. All of Statistics: A Concise Course in Statistical Inference. Book by Larry A. Wasserman
6. The Art of Statistics: How to Learn from Data. Book by David Spiegelhalter
7. The Signal and the Noise: Why So Many Predictions Fail-but Some Don't. Book by Nate Silver
8. Head First Statistics. Book by Dawn Griffiths
9. Think Stats. Book by Allen B. Downey
10. How to Lie with Statistics. Book by Darrell Huff
11. Statistics in plain English. Book by Timothy C. Urdan
12. OpenIntro Statistics. Textbook by Christopher Barr, David Diez, and Mine Çetinkaya-Rundel
13. The cartoon guide to statistics. Book by Larry Gonick
14. The Book of Why. Book by Dana Mackenzie and Judea Pearl
15. Pattern Recognition and Machine Learning. Book by Christopher Bishop
16. Discovering Statistics Using R. Book by Andy Field and Jeremy Miles
17. Statistics for Dummies. Book by Deborah Rumsey
18. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition. Originally published: 2009
19. The Visual Display of Quantitative Information. Book by Edward Tufte
20. Statistics Done Wrong: The Woefully Complete Guide. Book by Alex Reinhart
21. Applied Predictive Modeling. Book by Kjell Johnson and Max Kuhn
22. Workbook to accompany Statistics for Business and Economics. Originally published: 1984. Authors: David R. Anderson, Thomas A. Williams, Dennis J. Sweeney, Thomas Arthur Williams
23. Probability and statistics. Book by Morris H. DeGroot
24. First Course in Probability, A. Book by Sheldon M. Ross
25. Mathematical statistics with applications. Textbook by Dennis D. Wackerly
26. Principles of statistics. Book by M. G. Bulmer
27. Statistics for People Who. Book by Neil J. Salkind
28. Discovering Statistics Using IBM SPSS Statistics. Book by Andy Field
29. Statistics II for Dummies. Book by Deborah Rumsey
30. R for Data Science. Book by Garrett Grolemund and Hadley Wickham
31. Weapons of Math Destruction. Book by Cathy O'Neil
32. Statistics for experimenters. Book by George E. P. Box
33. Bayesian Data Analysis. Originally published: 1995. Authors: Andrew Gelman, Aki Vehtari, John Carlin, Hal S. Stern, David Dunson, Donald Rubin. Editor: Andrew Gelman
34. Data Analysis Using Regression and Multilevel/Hierarchical Models. Book by Andrew Gelman and Jennifer Hill
35. Thinking Statistically. Book by Uri Bram
36. How Not to Be Wrong. Book by Jordan Ellenberg
37. Bayesian Statistics the Fun Way: Understanding Statistics and Probability with Star Wars, LEGO, and Rubber Ducks. Book by Will Kurt
38. Statistics (Fourth Edition). Textbook by David A. Freedman, Robert Pisani, and Roger Purves
39. Introduction to Mathematical Statistics Edition. Textbook by Robert V. Hogg
40. The Lady Tasting Tea. Book by David Salsburg
41. The Drunkard's Walk: How Randomness Rules Our Lives. Book by Leonard Mlodinow
42. The Art of R Programming: A Tour of Statistical Software Design. Book by Norman Matloff
43. Freakonomics. Book by Stephen J. Dubner and Steven Levitt
44. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. Book by Richard McElreath
45. Statistics. Book by David A. Freedman
46. Student Study Guide to Accompany Statistics Alive! Book by Matthew Price, Wendy J. Steinberg, and Zoe Brier
47. A first course in linear model theory. Book by Nalini Ravishanker
48. Statistics Essentials for Dummies. Book by Deborah Rumsey
49. Probability and statistics. Book by Frederick Mosteller
50. Statistics. Book by Robert S. Witte
51. Introduction to Linear Algebra, 3rd Edition. Textbook by Gilbert Strang
52. Street-Fighting Mathematics: The Art of Educated Guessing and Opportunistic Problem Solving. Book by Sanjoy Mahajan. An antidote to mathematical rigor mortis, teaching how to guess answers without needing a proof or an exact calculation. In problem solving, as in street fighting, rules are for fools: do whatever works, don't just stand there! Yet we often fear an unjustified leap even though it may land us on a correct result. Originally published: 2010
Tags: Mathematical Foundations for Data Science, List of Books
Thursday, June 9, 2022
Creating a Taxonomy for BBC News Articles (Part 7 - Labeled Data for Taxonomy of News Articles from IPTC)
Tags: Technology, Natural Language Processing
Data Set (A)
Hypernym_hyponym_mapping_for_IPTC_news_articles.csv
Views: A.1, A.2
Data Set (B)
Taxonomy of News Articles from IPTC.xlsx
Views: B.1, B.2, B.3, B.4
Wednesday, June 8, 2022
Problem on odd-even and factorization of a number
Question: Name an odd number that is less than 50 and has at least 5 factors.
Ans: We first have to define what we call factors here:
Way 1: Non-trivial factors alone. That means we exclude 1 and the number itself.
Way 2: The number 1 and the number itself are the 'trivial factors' of a number. We include both trivial and non-trivial factors.
If we look at Way 2, then 45 is an answer. 45 has six factors: 1, 3, 5, 9, 15, 45.
If we consider only non-trivial factors, then there is no odd number less than 50 that yields five factors. Five non-trivial factors means at least seven factors in total. Let us build the number up from its prime factors. Since the number should be odd, there can be no 2 among its prime factors, so the lowest prime factor it can have is 3. A number with at most two distinct primes and at most three prime factors in total (counted with repetition) has at most six factors, as 45 = 3 * 3 * 5 shows. To get seven or more, the number needs either three distinct odd primes, where the smallest product is 3 * 5 * 7 = 105, or four prime factors in total, where the smallest product is 3 * 3 * 3 * 3. Now: 3 * 3 = 9, then 9 * 3 = 27, then 27 * 3 = 81. Both 105 and 81 are greater than 50. So, with Way 1, there is no answer to this question.
Tags: Mathematical Foundations for Data Science
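The two readings of the question can also be checked by brute force. The following is a minimal Python sketch; the helper names `factors` and `odd_numbers_with_factors` are made up for this illustration and are not part of the original post.

```python
def factors(n):
    """Return all positive factors of n, including 1 and n itself."""
    return [d for d in range(1, n + 1) if n % d == 0]


def odd_numbers_with_factors(limit, minimum, include_trivial):
    """Odd numbers below `limit` that have at least `minimum` factors."""
    hits = []
    for n in range(1, limit, 2):  # 1, 3, 5, ... (odd numbers only)
        fs = factors(n)
        if not include_trivial:
            # Way 1: drop the trivial factors 1 and n
            fs = [d for d in fs if d not in (1, n)]
        if len(fs) >= minimum:
            hits.append(n)
    return hits


print(odd_numbers_with_factors(50, 5, include_trivial=True))   # Way 2: [45]
print(odd_numbers_with_factors(50, 5, include_trivial=False))  # Way 1: []
```

Under Way 2 the search returns only 45, and under Way 1 it returns nothing, which agrees with the argument above.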
Prime Factors of a Number (Solver)
Enter a number to find its prime factors.
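The interactive solver itself is not reproduced on this page. As a rough stand-in, here is a minimal Python sketch of prime factorization by trial division; the function name `prime_factors` is an assumption for this example, and the post's own solver may work differently.

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 2) in ascending order, with repetition."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    result = []
    divisor = 2
    # Any composite n has a prime factor no larger than sqrt(n)
    while divisor * divisor <= n:
        while n % divisor == 0:
            result.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left after dividing out the small factors is itself prime
        result.append(n)
    return result


print(prime_factors(45))  # [3, 3, 5]
print(prime_factors(81))  # [3, 3, 3, 3]
```

Trial division is fine for the small inputs a page like this would expect; it becomes slow for numbers with very large prime factors.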
Tags: Technology,Mathematical Foundations for Data Science,Web Development,JavaScript,
Tuesday, June 7, 2022
Creating a Taxonomy for BBC News Articles (Part 6 based on - A Hybrid Approach to Hypernym Discovery)
Tags: Technology, Natural Language Processing
The difference between Part 5 and Part 6
In Part 5, we were using cosine distance between the input text and the output label. In Part 6, we first find the dot product and then get a probability using the sigmoid function, similar to logistic regression. A sigmoid function is a mathematical function having a characteristic "S"-shaped curve or sigmoid curve.

    import math

    import pandas as pd
    import numpy as np
    from sentence_transformers import SentenceTransformer
    from sklearn.metrics.pairwise import cosine_similarity  # Expects 2D arrays as input
    from scipy.spatial.distance import cosine  # Works with 1D vectors
    from sklearn.metrics import classification_report

    smodel = SentenceTransformer('distilbert-base-nli-mean-tokens')

    df1 = pd.read_csv('bbc_news_train.csv')
    df1.head()

    def get_sentence_vector(query):
        # encode() takes a list of texts; keep the single resulting vector
        query_vec = smodel.encode([query])[0]
        return query_vec

    df1['textVec'] = df1['Text'].apply(lambda x: get_sentence_vector(x))

    def std_category(x):
        # Map the short category names in the data to full-word labels
        if x == 'tech':
            return 'technology'
        elif x == 'sport':
            return 'sports'
        else:
            return x

    df1['Category'] = df1['Category'].apply(std_category)

    def sigmoid(x):
        return 1 / (1 + math.exp(-x))

    def get_logistic_regression_probability(x, Y):
        # Dot product of text vector and label vector, squashed into (0, 1)
        y = smodel.encode([Y])[0]
        d = np.dot(x, y)
        s = sigmoid(d)
        return s

    df1['proba_business'] = df1['textVec'].apply(lambda x: get_logistic_regression_probability(x, 'business'))
    # CPU times: total: 2min 1s. Wall time: 1min 1s

    df1['Category'].unique()
    # OUTPUT: array(['business', 'technology', 'politics', 'sports', 'entertainment'], dtype=object)

    df1['proba_technology'] = df1['textVec'].apply(lambda x: get_logistic_regression_probability(x, 'technology'))
    df1['proba_politics'] = df1['textVec'].apply(lambda x: get_logistic_regression_probability(x, 'politics'))
    df1['proba_sports'] = df1['textVec'].apply(lambda x: get_logistic_regression_probability(x, 'sports'))
    df1['proba_entertainment'] = df1['textVec'].apply(lambda x: get_logistic_regression_probability(x, 'entertainment'))

    def get_prediction(in_row):
        # Pick the label whose probability column is largest
        max_proba = 0
        label = ""
        for i in ['proba_business', 'proba_technology', 'proba_politics', 'proba_sports', 'proba_entertainment']:
            d = in_row[i]
            if d > max_proba:
                max_proba = d
                label = i.split('_')[1]
        return label

    df1['prediction'] = df1.apply(lambda in_row: get_prediction(in_row), axis = 1)

    target_names = ['business', 'entertainment', 'politics', 'sports', 'technology']
    print(classification_report(df1['Category'], df1['prediction'], target_names=target_names))
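The scoring rule above can be tried without downloading the model. This is a minimal sketch with made-up 3-dimensional vectors standing in for the sentence embeddings; the vectors and the names `label_vectors` and `text_vector` are assumptions for illustration only.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Hypothetical 3-d stand-ins for sentence embeddings (real ones come from the model)
label_vectors = {
    "business": np.array([1.0, 0.0, 0.0]),
    "sports": np.array([0.0, 1.0, 0.0]),
    "politics": np.array([0.0, 0.0, 1.0]),
}
text_vector = np.array([0.2, 1.5, -0.3])

# Each label vector is built once and reused, rather than re-encoded for every row
scores = {
    label: float(sigmoid(np.dot(text_vector, vec)))
    for label, vec in label_vectors.items()
}
prediction = max(scores, key=scores.get)
print(scores)
print(prediction)
```

One thing this makes visible: the sigmoid is monotonic, so the winning label is simply the one with the largest dot product. With real embeddings the dot products can be large, in which case several scores may saturate close to 1.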
Creating a Taxonomy for BBC News Articles (Part 5 based on - A Hybrid Approach to Hypernym Discovery)
    from collections import Counter

    import pandas as pd
    from sentence_transformers import SentenceTransformer
    from sklearn.metrics.pairwise import cosine_similarity  # Expects 2D arrays as input
    from scipy.spatial.distance import cosine  # Works with 1D vectors
    from sklearn.metrics import classification_report

    smodel = SentenceTransformer('distilbert-base-nli-mean-tokens')

    df1 = pd.read_csv('bbc_news_train.csv')
    df1.head()

    def get_sentence_vector(query):
        # encode() takes a list of texts; keep the single resulting vector
        query_vec = smodel.encode([query])[0]
        return query_vec

    # %%time  (Jupyter cell magic; only valid as the first line of a notebook cell)
    df1['textVec'] = df1['Text'].apply(lambda x: get_sentence_vector(x))
    df1.head()

    def std_category(x):
        # Map the short category names in the data to full-word labels
        if x == 'tech':
            return 'technology'
        elif x == 'sport':
            return 'sports'
        else:
            return x

    df1['Category'] = df1['Category'].apply(std_category)

    def get_cosine_sim(x, Y):
        # Despite the name, scipy's cosine() returns the cosine distance
        # (1 - similarity), so a smaller value means a closer match
        y = smodel.encode([Y])[0]
        return cosine(x, y)

    df1['cdist_business'] = df1['textVec'].apply(lambda x: get_cosine_sim(x, 'business'))

    df1['Category'].unique()
    # OUTPUT: array(['business', 'technology', 'politics', 'sports', 'entertainment'], dtype=object)

    df1['cdist_technology'] = df1['textVec'].apply(lambda x: get_cosine_sim(x, 'technology'))
    df1['cdist_politics'] = df1['textVec'].apply(lambda x: get_cosine_sim(x, 'politics'))
    df1['cdist_sports'] = df1['textVec'].apply(lambda x: get_cosine_sim(x, 'sports'))
    df1['cdist_entertainment'] = df1['textVec'].apply(lambda x: get_cosine_sim(x, 'entertainment'))

    def get_prediction(in_row):
        # Pick the label whose distance column is smallest
        min_dist = 99999999
        label = ""
        for i in ['cdist_business', 'cdist_technology', 'cdist_politics', 'cdist_sports', 'cdist_entertainment']:
            d = in_row[i]
            if d < min_dist:
                min_dist = d
                label = i.split('_')[1]
        return label

    df1['prediction'] = df1.apply(lambda in_row: get_prediction(in_row), axis = 1)
    df1.head()

    target_names = ['business', 'entertainment', 'politics', 'sports', 'technology']
    print(classification_report(df1['Category'], df1['prediction'], target_names=target_names))

    Counter(df1['Category'])
    # OUTPUT: Counter({'business': 336, 'technology': 261, 'politics': 274, 'sports': 346, 'entertainment': 273})

Tags: Technology, Natural Language Processing
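The nearest-label rule used in Part 5 can likewise be sketched without the model. The vectors below are made up for illustration (they are not real sentence embeddings), and `cosine_distance` is a hand-written helper that is meant to match the usual definition, one minus the cosine similarity.

```python
import numpy as np


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means the vectors point in the same direction."""
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


# Hypothetical 3-d stand-ins for the label and text embeddings
label_vectors = {
    "business": np.array([1.0, 0.0, 0.0]),
    "sports": np.array([1.0, 2.0, 0.1]),
    "politics": np.array([0.0, 0.0, 1.0]),
}
text_vector = np.array([1.0, 2.0, 0.0])

distances = {label: cosine_distance(text_vector, vec) for label, vec in label_vectors.items()}
prediction = min(distances, key=distances.get)  # smallest distance wins
print(distances)
print(prediction)
```

Unlike the dot product, cosine distance ignores vector length, so a long text vector does not dominate the comparison.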
Monday, June 6, 2022
Reasoning With Word Vectors (Word2Vec, GloVe, fastText and Doc2Vec)
Tags: Technology, Natural Language Processing, Python
Word2Vec
In 2012, Tomas Mikolov, an intern at Microsoft, found a way to encode the meaning of words in a modest number of vector dimensions. Mikolov trained a neural network5 to predict word occurrences near each target word. In 2013, once at Google, Mikolov and his teammates released the software for creating these word vectors and called it Word2vec.
Word2vec learns the meaning of words merely by processing a large corpus of unlabeled text. No one has to label the words in the Word2vec vocabulary. No one has to tell the Word2vec algorithm that Marie Curie is a scientist, that the Timbers are a soccer team, that Seattle is a city, or that Portland is a city in both Oregon and Maine. And no one has to tell Word2vec that soccer is a sport, or that a team is a group of people, or that cities are both places as well as communities. Word2vec can learn that and much more, all on its own! All you need is a corpus large enough to mention Marie Curie and Timbers and Portland near other words associated with science or soccer or cities.
This unsupervised nature of Word2vec is what makes it so powerful. The world is full of unlabeled, uncategorized, unstructured natural language text. Instead of trying to train a neural network to learn the target word meanings directly (on the basis of labels for that meaning), you teach the network to predict words near the target word in your sentences. So in this sense, you do have labels: the nearby words you’re trying to predict. But because the labels are coming from the dataset itself and require no hand-labeling, the Word2vec training algorithm is definitely an unsupervised learning algorithm.
And the prediction itself isn’t what makes Word2vec work. The prediction is merely a means to an end. What you do care about is the internal representation, the vector that Word2vec gradually builds up to help it generate those predictions. Word2vec will learn about things you might not think to associate with all words.
Did you know that every word has some geography, sentiment (positivity), and gender associated with it? If any word in your corpus has some quality, like “placeness,” “peopleness,” “conceptness,” or “femaleness,” all the other words will also be given a score for these qualities in your word vectors. The meaning of a word “rubs off” on the neighboring words when Word2vec learns word vectors.
Word2vec allows you to transform your natural language vectors of token occurrence counts and frequencies into the vector space of much lower-dimensional Word2vec vectors. In this lower-dimensional space, you can do your math and then convert back to a natural language space. You can imagine how useful this capability is to a chatbot, search engine, question answering system, or information extraction algorithm.
The research team also discovered that the difference between a singular and a plural word is often roughly the same magnitude, and in the same direction:
Equation 6.2: Distance between the singular and plural versions of a word
But their discovery didn’t stop there. They also discovered that the distance relationships go far beyond simple singular versus plural relationships. Distances apply to other semantic relationships. The Word2vec researchers soon discovered they could answer questions that involve geography, culture, and demographics, like this:
"San Francisco is to California as what is to Colorado?"
San Francisco - California + Colorado = Denver
How to compute Word2vec representations
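Before turning to training, the analogy arithmetic from the previous section can be sketched with hand-made toy vectors. Everything here is an assumption for illustration: the three dimensions, the vectors, and the helper names are invented, and real Word2vec vectors are learned from a corpus and have many more dimensions.

```python
import numpy as np

# Hand-made 3-d toy vectors (NOT learned embeddings). The dimensions are
# [is a city, tied to California, tied to Colorado].
vectors = {
    "San Francisco": np.array([1.0, 1.0, 0.0]),
    "California": np.array([0.0, 1.0, 0.0]),
    "Colorado": np.array([0.0, 0.0, 1.0]),
    "Denver": np.array([1.0, 0.0, 1.0]),
    "Seattle": np.array([1.0, 0.0, 0.0]),
}


def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def analogy(a, b, c):
    """Answer 'a is to b as ? is to c' with the word nearest to a - b + c."""
    query = vectors[a] - vectors[b] + vectors[c]
    # The words used in the question are excluded from the candidates
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine_similarity(query, vectors[w]))


print(analogy("San Francisco", "California", "Colorado"))  # Denver
```

With learned vectors the result of the subtraction and addition rarely lands exactly on a word, which is why the lookup picks the nearest neighbor instead of testing for equality.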
Word vectors represent the semantic meaning of words as vectors in the context of the training corpus. This allows you not only to answer analogy questions but also reason about the meaning of words in more general ways with vector algebra. But how do you calculate these vector representations? There are two possible ways to train Word2vec embeddings:
Skip-gram approach: The skip-gram approach predicts the context of words (output words) from a word of interest (the input word).
Continuous bag-of-words: The continuous bag-of-words (CBOW) approach predicts the target word (the output word) from the nearby words (input words).
We show you how and when to use each of these to train a Word2vec model in the coming sections.
SKIP-GRAM APPROACH
In the skip-gram training approach, you’re trying to predict the surrounding window of words based on an input word. In the sentence about Monet, in our following example, “painted” is the training input to the neural network. The corresponding training output example skip-grams are shown in figure 6.3. The predicted words for these skip-grams are the neighboring words “Claude,” “Monet,” “the,” and “Grand.”
Let's say we have a sentence: Claude Monet painted the Grand Canal of Venice in 1908. We are going to predict neighbouring words for the word "Monet".
CONTINUOUS BAG-OF-WORDS APPROACH
In the continuous bag-of-words approach, you’re trying to predict the center word based on the surrounding words (see figures 6.5 and 6.6 and table 6.2). Instead of creating pairs of input and output tokens, you’ll create a multi-hot vector of all surrounding terms as an input vector. The multi-hot input vector is the sum of all one-hot vectors of the surrounding tokens to the center, target token.
Continuous bag of words vs. bag of words
In previous chapters, we introduced the concept of a bag of words, but how is it different from a continuous bag of words? To establish the relationships between words in a sentence you slide a rolling window across the sentence to select the surrounding words for the target word. All words within the sliding window are considered to be the content of the continuous bag of words for the target word at the middle of that window.
SKIP-GRAM VS. CBOW: WHEN TO USE WHICH APPROACH
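To make the contrast concrete, here is a minimal sketch of how the same sliding window yields training examples for both approaches, using the Monet sentence and a window of two words on each side. The helper names (`window`, `skip_gram_pairs`, `cbow_example`) and the window size are assumptions for this illustration.

```python
import numpy as np

sentence = "Claude Monet painted the Grand Canal of Venice in 1908"
tokens = sentence.split()
vocab = sorted(set(tokens))
index = {word: i for i, word in enumerate(vocab)}


def window(tokens, i, size=2):
    """Tokens within `size` positions of index i, excluding tokens[i] itself."""
    lo, hi = max(0, i - size), min(len(tokens), i + size + 1)
    return [tokens[j] for j in range(lo, hi) if j != i]


def skip_gram_pairs(tokens, size=2):
    """(input word, output word) pairs: one pair per surrounding word."""
    return [(tokens[i], ctx) for i in range(len(tokens)) for ctx in window(tokens, i, size)]


def cbow_example(tokens, i, size=2):
    """(multi-hot context vector, target word) for the token at index i."""
    multi_hot = np.zeros(len(vocab), dtype=int)
    for ctx in window(tokens, i, size):
        multi_hot[index[ctx]] += 1  # sum of the one-hot vectors of the context
    return multi_hot, tokens[i]


painted = tokens.index("painted")
painted_pairs = [p for p in skip_gram_pairs(tokens) if p[0] == "painted"]
context_vec, target = cbow_example(tokens, painted)
print(painted_pairs)
print(context_vec, target)
```

Skip-gram turns the window around "painted" into four separate training pairs, while CBOW folds the same four words into a single input vector with "painted" as the one target. That difference is why skip-gram produces more training examples from the same text.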
Mikolov highlighted that the skip-gram approach works well with small corpora and rare terms. With the skip-gram approach, you’ll have more examples due to the network structure. But the continuous bag-of-words approach shows higher accuracies for frequent words and is much faster to train.
Negative sampling
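As a rough picture of the idea this section describes: for one training pair, only the true context word and a handful of sampled "negative" words have their output weights updated, rather than every word in the vocabulary. The sketch below only selects those words and does no training. The function name and vocabulary are made up, and negatives are drawn uniformly here as a simplification; the original Word2vec implementation samples them from a smoothed word-frequency distribution.

```python
import random


def training_targets(vocab, target_word, context_word, n_negative=5, seed=0):
    """Output words touched by one training example: the true context word
    (label 1) plus n_negative sampled non-context words (label 0)."""
    rng = random.Random(seed)
    candidates = [w for w in vocab if w not in (target_word, context_word)]
    negatives = rng.sample(candidates, n_negative)  # uniform: a simplification
    return [(context_word, 1)] + [(w, 0) for w in negatives]


# Hypothetical vocabulary of about a thousand words
vocab = [f"word{i}" for i in range(1000)] + ["painted", "Monet"]
targets = training_targets(vocab, "painted", "Monet", n_negative=5)
print(len(targets), "of", len(vocab), "output words updated")
```

Six updated rows instead of 1002 is where the speed-up comes from, and the gap widens as the vocabulary grows.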
One last trick Mikolov came up with was the idea of negative sampling. If a single training example with a pair of words is presented to the network, it’ll cause all weights for the network to be updated. This changes the values of all the vectors for all the words in your vocabulary. But if your vocabulary contains thousands or millions of words, updating all the weights for the large one-hot vector is inefficient. To speed up the training of word vector models, Mikolov used negative sampling.
Instead of updating all word weights that weren’t included in the word window, Mikolov suggested sampling just a few negative samples (in the output vector) to update their weights. Instead of updating all weights, you pick n negative example word pairs (words that don’t match your target output for that example) and update the weights that contributed to their specific output. That way, the computation can be reduced dramatically and the performance of the trained network doesn’t decrease significantly.
NOTE: If you train your word model with a small corpus, you might want to use a negative sampling rate of 5 to 20 samples. For larger corpora and vocabularies, you can reduce the negative sample rate to as low as two to five samples, according to Mikolov and his team.
Word2vec vs. GloVe (Global Vectors)
Word2vec was a breakthrough, but it relies on a neural network model that must be trained using backpropagation. Backpropagation is usually less efficient than direct optimization of a cost function using gradient descent. Stanford NLP researchers21 led by Jeffrey Pennington set about to understand the reason why Word2vec worked so well and to find the cost function that was being optimized. They started by counting the word co-occurrences and recording them in a square matrix. They found they could compute the singular value decomposition22 of this co-occurrence matrix, splitting it into the same two weight matrices that Word2vec produces.23 The key was to normalize the co-occurrence matrix the same way. But in some cases the Word2vec model failed to converge to the same global optimum that the Stanford researchers were able to achieve with their SVD approach. It’s this direct optimization of the global vectors of word co-occurrences (co-occurrences across the entire corpus) that gives GloVe its name. GloVe can produce matrices equivalent to the input weight matrix and output weight matrix of Word2vec, producing a language model with the same accuracy as Word2vec but in much less time. GloVe speeds the process by using the text data more efficiently. GloVe can be trained on smaller corpora and still converge.24 And SVD algorithms have been refined for decades, so GloVe has a head start on debugging and algorithm optimization. Word2vec relies on backpropagation to update the weights that form the word embeddings. Neural network backpropagation is less efficient than more mature optimization algorithms such as those used within SVD for GloVe. Even though Word2vec first popularized the concept of semantic reasoning with word vectors, your workhorse should probably be GloVe to train new word vector models. With GloVe you’ll be more likely to find the global optimum for those vector representations, giving you more accurate results. 
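The count-then-factorize idea described above can be sketched on a tiny corpus. This is only an illustration of building a square co-occurrence matrix and splitting it with SVD; it is not the GloVe training procedure itself, and the corpus, the window size, the `log1p` scaling, and the choice of two dimensions are all assumptions made for the example.

```python
import numpy as np

# Tiny made-up corpus
corpus = [
    "monet painted the grand canal",
    "monet painted water lilies",
    "the canal is in venice",
]
tokens = [line.split() for line in corpus]
vocab = sorted({w for line in tokens for w in line})
index = {w: i for i, w in enumerate(vocab)}

# Square co-occurrence matrix: counts of word pairs within a +/-2 word window
window = 2
cooc = np.zeros((len(vocab), len(vocab)))
for line in tokens:
    for i, w in enumerate(line):
        for j in range(max(0, i - window), min(len(line), i + window + 1)):
            if j != i:
                cooc[index[w], index[line[j]]] += 1

# Factor the log-scaled counts with SVD and keep the top-k directions
k = 2
u, s, vt = np.linalg.svd(np.log1p(cooc))
word_vectors = u[:, :k] * s[:k]  # one k-dimensional vector per vocabulary word
print(word_vectors.shape)
```

Because the counts are gathered over the whole corpus before any factorization happens, every word pair contributes once, no matter how often it would have been revisited during example-by-example training.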
Advantages of GloVe are:
# Faster training
# Better RAM/CPU efficiency (can handle larger documents)
# More efficient use of data (helps with smaller corpora)
# More accurate for the same amount of training
fastText
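The character n-grams that this section lists for “whisper” can be generated with a few lines of Python. This is a minimal sketch of that listing only; the function name `char_ngrams` is an assumption, and the real fastText library uses its own n-gram lengths and word-boundary markers, which are not modeled here.

```python
def char_ngrams(word, sizes=(2, 3)):
    """Character n-grams of `word`, ordered by start position, then by size."""
    grams = []
    for start in range(len(word)):
        for n in sizes:
            if start + n <= len(word):  # skip n-grams that would run past the end
                grams.append(word[start:start + n])
    return grams


print(", ".join(char_ngrams("whisper")))
# wh, whi, hi, his, is, isp, sp, spe, pe, per, er
```

A misspelling such as "whispr" still shares most of these n-grams with "whisper", which is the intuition behind the better handling of rare and misspelled words.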
Researchers from Facebook took the concept of Word2vec one step further25 by adding a new twist to the model training. The new algorithm, which they named fastText, predicts the surrounding n-character grams rather than just the surrounding words, like Word2vec does. For example, the word “whisper” would generate the following 2- and 3-character grams:
wh, whi, hi, his, is, isp, sp, spe, pe, per, er
fastText trains a vector representation for every n-character gram, which includes words, misspelled words, partial words, and even single characters. The advantage of this approach is that it handles rare words much better than the original Word2vec approach.
As part of the fastText release, Facebook published pretrained fastText models for 294 languages. On the GitHub page of Facebook research,26 you can find models ranging from Abkhazian to Zulu. The model collection even includes rare languages such as Saterland Frisian, which is only spoken by a handful of Germans. The pretrained fastText models provided by Facebook have only been trained on the available Wikipedia corpora. Therefore the vocabulary and accuracy of the models will vary across languages.
Word vectors are biased!
Word vectors learn word relationships based on the training corpus. If your corpus is about finance then your “bank” word vector will be mainly about businesses that hold deposits. If your corpus is about geology, then your “bank” word vector will be trained on associations with rivers and streams. And if your corpus is mostly about a matriarchal society with women bankers and men washing clothes in the river, then your word vectors would take on that gender bias.
The following example shows the gender bias of a word model trained on Google News articles. If you calculate the distance between “man” and “nurse” and compare that to the distance between “woman” and “nurse,” you’ll be able to see the bias:

    >>> word_model.distance('man', 'nurse')
    0.7453
    >>> word_model.distance('woman', 'nurse')
    0.5586

Identifying and compensating for biases like this is a challenge for any NLP practitioner that trains her models on documents written in a biased world.
Document similarity with Doc2vec
The concept of Word2vec can also be extended to sentences, paragraphs, or entire documents. The idea of predicting the next word based on the previous words can be extended by training a paragraph or document vector (see figure 6.10).33 In this case, the prediction not only considers the previous words, but also the vector representing the paragraph or the document. It can be considered as an additional word input to the prediction. Over time, the algorithm learns a document or paragraph representation from the training set. How are document vectors generated for unseen documents after the training phase? During the inference stage, the algorithm adds more document vectors to the document matrix and computes the added vector based on the frozen word vector matrix, and its weights. By inferring a document vector, you can now create a semantic representation of the whole document. Figure 6.10 Doc2vec training uses an additional document vector as input.