WHY IS IT so easy to repeat bad habits and so hard to form good ones? Few things can have a more powerful impact on your life than improving your daily habits. And yet it is likely that this time next year you'll be doing the same thing rather than something better. It often feels difficult to keep good habits going for more than a few days, even with sincere effort and the occasional burst of motivation. Habits like exercise, meditation, journaling, and cooking are reasonable for a day or two and then become a hassle. However, once your habits are established, they seem to stick around forever—especially the unwanted ones. Despite our best intentions, unhealthy habits like eating junk food, watching too much television, procrastinating, and smoking can feel impossible to break. Changing our habits is challenging for two reasons: (1) we try to change the wrong thing and (2) we try to change our habits in the wrong way. In this chapter, I'll address the first point. In the chapters that follow, I'll answer the second. Our first mistake is that we try to change the wrong thing. To understand what I mean, consider that there are three levels at which change can occur. You can imagine them like the layers of an onion.THREE LAYERS OF BEHAVIOR CHANGE
FIGURE 3: There are three layers of behavior change: a change in your outcomes, a change in your processes, or a change in your identity. The first layer is changing your outcomes. This level is concerned with changing your results: losing weight, publishing a book, winning a championship. Most of the goals you set are associated with this level of change. The second layer is changing your process. This level is concerned with changing your habits and systems: implementing a new routine at the gym, decluttering your desk for better workflow, developing a meditation practice. Most of the habits you build are associated with this level. The third and deepest layer is changing your identity. This level is concerned with changing your beliefs: your worldview, your self-image, your judgments about yourself and others. Most of the beliefs, assumptions, and biases you hold are associated with this level. Outcomes are about what you get. Processes are about what you do. Identity is about what you believe. When it comes to building habits that last—when it comes to building a system of 1 percent improvements—the problem is not that one level is “better” or “worse” than another. All levels of change are useful in their own way. The problem is the direction of change. Many people begin the process of changing their habits by focusing on what they want to achieve. This leads us to outcome-based habits. The alternative is to build identity-based habits. With this approach, we start by focusing on who we wish to become. FIGURE 4: With outcome-based habits, the focus is on what you want to achieve. With identity-based habits, the focus is on who you wish to become. Imagine two people resisting a cigarette. When offered a smoke, the first person says, “No thanks. I'm trying to quit.” It sounds like a reasonable response, but this person still believes they are a smoker who is trying to be something else. They are hoping their behavior will change while carrying around the same beliefs. The second person declines by saying, “No thanks. I'm not a smoker.” It's a small difference, but this statement signals a shift in identity. Smoking was part of their former life, not their current one. They no longer identify as someone who smokes. Most people don't even consider identity change when they set out to improve. They just think, “I want to be skinny (outcome) and if I stick to this diet, then I'll be skinny (process).” They set goals and determine the actions they should take to achieve those goals without considering the beliefs that drive their actions. They never shift the way they look at themselves, and they don't realize that their old identity can sabotage their new plans for change. Behind every system of actions are a system of beliefs. The system of a democracy is founded on beliefs like freedom, majority rule, and social equality. The system of a dictatorship has a very different set of beliefs like absolute authority and strict obedience. You can imagine many ways to try to get more people to vote in a democracy, but such behavior change would never get off the ground in a dictatorship. That's not the identity of the system. Voting is a behavior that is impossible under a certain set of beliefs. A similar pattern exists whether we are discussing individuals, organizations, or societies. There are a set of beliefs and assumptions that shape the system, an identity behind the habits. Behavior that is incongruent with the self will not last. You may want more money, but if your identity is someone who consumes rather than creates, then you'll continue to be pulled toward spending rather than earning. You may want better health, but if you continue to prioritize comfort over accomplishment, you'll be drawn to relaxing rather than training. It's hard to change your habits if you never change the underlying beliefs that led to your past behavior. You have a new goal and a new plan, but you haven't changed who you are. ~~~ True behavior change is identity change. You might start a habit because of motivation, but the only reason you'll stick with one is that it becomes part of your identity. Anyone can convince themselves to visit the gym or eat healthy once or twice, but if you don't shift the belief behind the behavior, then it is hard to stick with long-term changes. Improvements are only temporary until they become part of who you are. The goal is not to read a book, the goal is to become a reader. The goal is not to run a marathon, the goal is to become a runner. The goal is not to learn an instrument, the goal is to become a musician.THE TWO-STEP PROCESS TO CHANGING YOUR IDENTITY
Your identity emerges out of your habits. You are not born with preset beliefs. Every belief, including those about yourself, is learned and conditioned through experience.* More precisely, your habits are how you embody your identity. When you make your bed each day, you embody the identity of an organized person. When you write each day, you embody the identity of a creative person. When you train each day, you embody the identity of an athletic person. The more you repeat a behavior, the more you reinforce the identity associated with that behavior. In fact, the word identity was originally derived from the Latin words essentitas, which means being, and identidem, which means repeatedly. Your identity is literally your “repeated beingness.” Whatever your identity is right now, you only believe it because you have proof of it. If you go to church every Sunday for twenty years, you have evidence that you are religious. If you study biology for one hour every night, you have evidence that you are studious. If you go to the gym even when it's snowing, you have evidence that you are committed to fitness. The more evidence you have for a belief, the more strongly you will believe it. For most of my early life, I didn't consider myself a writer. If you were to ask any of my high school teachers or college professors, they would tell you I was an average writer at best: certainly not a standout. When I began my writing career, I published a new article every Monday and Thursday for the first few years. As the evidence grew, so did my identity as a writer. I didn't start out as a writer. I became one through my habits.Of course, your habits are not the only actions that influence your identity, but by virtue of their frequency they are usually the most important ones. Each experience in life modifies your self-image, but it's unlikely you would consider yourself a soccer player because you kicked a ball once or an artist because you scribbled a picture. As you repeat these actions, however, the evidence accumulates and your self- image begins to change. The effect of one-off experiences tends to fade away while the effect of habits gets reinforced with time, which means your habits contribute most of the evidence that shapes your identity. In this way, the process of building habits is actually the process of becoming yourself. This is a gradual evolution. We do not change by snapping our fingers and deciding to be someone entirely new. We change bit by bit, day by day, habit by habit. We are continually undergoing microevolutions of the self. Each habit is like a suggestion: “Hey, maybe this is who I am.” If you finish a book, then perhaps you are the type of person who likes reading. If you go to the gym, then perhaps you are the type of person who likes exercise. If you practice playing the guitar, perhaps you are the type of person who likes music. Every action you take is a vote for the type of person you wish to become. No single instance will transform your beliefs, but as the votes build up, so does the evidence of your new identity. This is one reason why meaningful change does not require radical change. Small habits can make a meaningful difference by providing evidence of a new identity. And if a change is meaningful, it actually is big. That's the paradox of making small improvements. Putting this all together, you can see that habits are the path to changing your identity. The most practical way to change who you are is to change what you do. Each time you write a page, you are a writer. Each time you practice the violin, you are a musician. Each time you start a workout, you are an athlete. Each time you encourage your employees, you are a leader. Each habit not only gets results but also teaches you something far more important: to trust yourself. You start to believe you can actually accomplish these things. When the votes mount up and the evidence begins to change, the story you tell yourself begins to change as well. Of course, it works the opposite way, too. Every time you choose to perform a bad habit, it's a vote for that identity. The good news is that you don't need to be perfect. In any election, there are going to be votes for both sides. You don't need a unanimous vote to win an election; you just need a majority. It doesn't matter if you cast a few votes for a bad behavior or an unproductive habit. Your goal is simply to win the majority of the time. New identities require new evidence. If you keep casting the same votes you've always cast, you're going to get the same results you've always had. If nothing changes, nothing is going to change. It is a simple two-step process: 1. Decide the type of person you want to be. 2. Prove it to yourself with small wins. First, decide who you want to be. This holds at any level—as an individual, as a team, as a community, as a nation. What do you want to stand for? What are your principles and values? Who do you wish to become? These are big questions, and many people aren't sure where to begin —but they do know what kind of results they want: to get six-pack abs or to feel less anxious or to double their salary. That's fine. Start there and work backward from the results you want to the type of person who could get those results. Ask yourself, “Who is the type of person that could get the outcome I want?” Who is the type of person that could lose forty pounds? Who is the type of person that could learn a new language? Who is the type of person that could run a successful start-up? For example, “Who is the type of person who could write a book?” It's probably someone who is consistent and reliable. Now your focus shifts from writing a book (outcome-based) to being the type of person who is consistent and reliable (identity-based). This process can lead to beliefs like: “I'm the kind of teacher who stands up for her students.” “I'm the kind of doctor who gives each patient the time and empathy they need.” “I'm the kind of manager who advocates for her employees.” Once you have a handle on the type of person you want to be, you can begin taking small steps to reinforce your desired identity. I have a friend who lost over 100 pounds by asking herself, “What would a healthy person do?” All day long, she would use this question as a guide. Would a healthy person walk or take a cab? Would a healthy person order a burrito or a salad? She figured if she acted like a healthy person long enough, eventually she would become that person. She was right. The concept of identity-based habits is our first introduction to another key theme in this book: feedback loops. Your habits shape your identity, and your identity shapes your habits. It's a two-way street. The formation of all habits is a feedback loop (a concept we will explore in depth in the next chapter), but it's important to let your values, principles, and identity drive the loop rather than your results. The focus should always be on becoming that type of person, not getting a particular outcome.THE REAL REASON HABITS MATTER
Identity change is the North Star of habit change. The remainder of this book will provide you with step-by-step instructions on how to build better habits in yourself, your family, your team, your company, and anywhere else you wish. But the true question is: “Are you becoming the type of person you want to become?” The first step is not what or how, but who. You need to know who you want to be. Otherwise, your quest for change is like a boat without a rudder. And that's why we are starting here. You have the power to change your beliefs about yourself. Your identity is not set in stone. You have a choice in every moment. You can choose the identity you want to reinforce today with the habits you choose today. And this brings us to the deeper purpose of this book and the real reason habits matter. Building better habits isn't about littering your day with life hacks. It's not about flossing one tooth each night or taking a cold shower each morning or wearing the same outfit each day. It's not about achieving external measures of success like earning more money, losing weight, or reducing stress. Habits can help you achieve all of these things, but fundamentally they are not about having something. They are about becoming someone. Ultimately, your habits matter because they help you become the type of person you wish to be. They are the channel through which you develop your deepest beliefs about yourself. Quite literally, you become your habits.Chapter Summary
# There are three levels of change: outcome change, process change, and identity change. # The most effective way to change your habits is to focus not on what you want to achieve, but on who you wish to become. # Your identity emerges out of your habits. Every action is a vote for the type of person you wish to become. # Becoming the best version of yourself requires you to continuously edit your beliefs, and to upgrade and expand your identity. # The real reason habits matter is not because they can get you better results (although they can do that), but because they can change your beliefs about yourself.
Saturday, April 6, 2024
How Your Habits Shape Your Identity (and Vice Versa) - Chapter 2 From The Book Atomic Habits
Friday, April 5, 2024
Building Zero Shot Classifiers For Text Using Large Language Models
The software tools we need for the activity covered in this post are:
- Google Colab
- GitHub
- And last: ChatGPT
Why I needed these three items?
I needed Google Colab to write code. Google Colab allowed to me not care about creating a local environment and setting it up with the required packages such as 'transformers' from Hugging Face.
I needed GitHub to put my code in a place that I can make available to myself anywhere and also to you (my readers).
I needed ChatGPT to get boilerplate code for our particular task. I learnt about the prompts I needed for this activity from the book by Sinan Ozdemir titled:
Quick Start Guide to Large Language Models. Strategies and Best Practices for using ChatGPT and Other LLMs - Addison-Wesley Professional (2023)
Download BookWhat you would need would (I think) is:
Google Colab that is connected for code to my public repository hosted on GitHub.
How to connect Google Colab with GitHub?
1. Once you would open the Google Colab, you get a screen as shown below.
Note that: I am logged in to my Google Account for this task.
2. Next: you would click on "File" at the top left.
You would click on "Open Notebook".
And you would select "GitHub" as shown below:
3. You would fill in the username. It is: ashishjain1547
Once you fill in the username, the "Repository" dropdown would auto populate with public repositories available for that user.
The repository you would select is: "generative_ai_workspace_2024_04_05"
4. Once repo is selected, it's notebooks start appearing below:
Code for zero shot Spam vs. Not Spam classifier using Facebook's BART
from transformers import pipeline def classify_text (email): """ Use Facebook's BART model to classify an email into "spam" or "not spam" Args: email (str): The email to classify Returns: str: The classification of the email """ classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli') labels = ['spam', 'not spam'] hypothesis_template = 'This email is {}.' results = classifier(email, labels, hypothesis_template=hypothesis_template) return results['labels'][0]
Usage:
How we used ChatGPT?
# Load the zero-shot classification pipeline with BERT
classifier = pipeline("zero-shot-classification", model="bert-base-uncased")
# Define a list of texts for sentiment analysis
texts = [
"I love this product, it's amazing!",
"This movie was terrible, I hated it.",
"The weather today is fantastic.",
"I feel neutral about this situation."
]
# Define the possible labels/categories for sentiment
candidate_labels = ["positive", "negative", "neutral"]
# Perform zero-shot sentiment analysis
results = classifier(texts, candidate_labels)
# Print the results
for i, text in enumerate(texts):
print(f"Text: {text}")
print("Sentiment probabilities:")
for label, score in zip(results[i]['labels'], results[i]['scores']):
print(f"{label}: {score:.4f}")
print()
Text: I love this product, it's amazing! Sentiment probabilities: positive: 0.4718 negative: 0.2679 neutral: 0.2603 Text: This movie was terrible, I hated it. Sentiment probabilities: negative: 0.3644 positive: 0.3179 neutral: 0.3177 Text: The weather today is fantastic. Sentiment probabilities: positive: 0.4026 negative: 0.3039 neutral: 0.2935 Text: I feel neutral about this situation. Sentiment probabilities: neutral: 0.3587 negative: 0.3248 positive: 0.3166
Additional Note
Question:
How does the BERT know that the problem is of sentiment analysis and not spam-vs-not_spam?
Answer:
In the code provided, the BERT model doesn't inherently "know" that the problem is specifically about sentiment analysis. Instead, it is designed for zero-shot classification, which means it can classify texts into predefined categories without being explicitly trained on those categories.
The key lies in the input data and the labels provided. In the example, we've defined a list of texts that contain statements or reviews, and we've also provided a list of candidate labels (positive
, negative
, neutral
) that represent sentiment categories. When you pass these texts and labels to the BERT model through the zero-shot classification pipeline, it infers the sentiment of each text based on its understanding of the language and the provided labels.
If you were to perform a different classification task, such as spam vs. not spam, you would simply replace the list of candidate labels with the appropriate labels for that task (e.g., spam
, not_spam
). The model doesn't need to be explicitly told the nature of the classification problem because it learns to make predictions based on the provided labels during training.
Thursday, April 4, 2024
EquiLeader (A problem in the concept of Leader in DSA using Boyer-Moore Majority Vote Algo)
Tags: Algorithms,Technology,EquiLeader: Find the index S such that the leaders of the sequences A[0], A[1], ..., A[S] and A[S + 1], A[S + 2], ..., A[N - 1] are the same.
A non-empty array A consisting of N integers is given. The leader of this array is the value that occurs in more than half of the elements of A. An equi leader is an index S such that 0 ≤ S < N - 1 and two sequences A[0], A[1], ..., A[S] and A[S + 1], A[S + 2], ..., A[N - 1] have leaders of the same value. For example, given array A such that: A[0] = 4 A[1] = 3 A[2] = 4 A[3] = 4 A[4] = 4 A[5] = 2 we can find two equi leaders: 0, because sequences: (4) and (3, 4, 4, 4, 2) have the same leader, whose value is 4. 2, because sequences: (4, 3, 4) and (4, 4, 2) have the same leader, whose value is 4. The goal is to count the number of equi leaders. Write a function: def solution(A) that, given a non-empty array A consisting of N integers, returns the number of equi leaders. For example, given: A[0] = 4 A[1] = 3 A[2] = 4 A[3] = 4 A[4] = 4 A[5] = 2 the function should return 2, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [1..100,000]; each element of array A is an integer within the range [-1,000,000,000..1,000,000,000].Taking help from ChatGPT
Our prompt looks like this: You are a programming guide. Here is my problem... ...Please guide how to solve this problem step by step.To solve this problem efficiently, we can use a two-step approach:
- Find the dominator (leader) of the array.
- Count the number of equi leaders.
Here's a step-by-step guide on how to approach this problem:
Step 1: Find the Dominator
Initialize two variables,
candidate
andcount
, wherecandidate
will store the current candidate for the dominator andcount
will store the count of occurrences of the current candidate.Iterate through the array
A
:
- If
count
is 0, set the current element as thecandidate
and setcount
to 1.- If the current element is equal to the
candidate
, incrementcount
by 1.- If the current element is not equal to the
candidate
, decrementcount
by 1.After iterating through the array,
candidate
will hold a potential dominator candidate. We need to verify if it's indeed the dominator by counting its occurrences in the array.Count occurrences of
candidate
.
- If the count of occurrences of
candidate
is not more than half the length of the array, there is no dominator.Step 2: Count the Equi Leaders
def solution(A): candidate = None count = 0 # Find candidate for dominator for num in A: if count == 0: candidate = num count = 1 elif num == candidate: count += 1 else: count -= 1 # Count occurrences of candidate count_candidate = A.count(candidate) # If candidate is not a dominator, return 0 if count_candidate <= len(A) // 2: return 0 leader_count = 0 equi_leader_count = 0 for i, num in enumerate(A): if num == candidate: leader_count += 1 # Check if the leader appears more than half of the elements up to the current index if leader_count > (i + 1) // 2: # Check if the leader appears more than half of the remaining elements remaining_count = count_candidate - leader_count if remaining_count > (len(A) - i - 1) // 2: equi_leader_count += 1 return equi_leader_count # Example usage A = [4, 3, 4, 4, 4, 2] print(solution(A)) # Output should be 2
- Initialize
leader_count
to count the total occurrences of the dominator.- Initialize
equi_leader_count
to count the number of equi leaders.- Iterate through the array again:
- If the current element equals the dominator, increment
leader_count
.- Calculate the remaining count of the dominator from the end of the array.
- If the count of the dominator up to the current index is more than half of the elements seen so far, and the count of the dominator from the current index to the end is more than half of the remaining elements, increment
equi_leader_count
.What is the core idea behind this algo?
The core idea behind this algorithm is to first find the dominator (leader) of the array efficiently, and then utilize the properties of the dominator to count the number of equi leaders.
Finding the Dominator:
- The algorithm efficiently finds the dominator using the Boyer-Moore Majority Vote algorithm. This algorithm allows us to find a majority element (dominator) in linear time complexity with constant space complexity.
- By iterating through the array, the algorithm identifies a potential dominator candidate and verifies whether it's indeed the dominator by counting its occurrences in the array. If the count of occurrences is more than half of the length of the array, it confirms the presence of a dominator.
Counting the Equi Leaders:
- Once the dominator is found, the algorithm iterates through the array again to count both the total occurrences of the dominator and the number of equi leaders.
- It keeps track of the leader count as it iterates through the array. For each index, it checks if the count of the dominator up to that index is more than half of the elements seen so far. If it is, it also checks if the count of the dominator from the current index to the end is more than half of the remaining elements. If both conditions are met, it increments the equi leader count.
- This approach leverages the fact that an equi leader can only occur if the dominator occurs in both subsequences with counts greater than half of the respective lengths. Therefore, by efficiently tracking these counts, the algorithm can accurately count the equi leaders.
By following this two-step approach, the algorithm effectively solves the problem of finding the number of equi leaders in an array, utilizing the properties of the dominator to optimize the solution.
Is it correct the equi-leaders and dominator have the same face value?
Yes, you are correct. In the context of this problem, the equi-leaders and the dominator have the same face value. In the problem statement, an equi-leader is defined as an index S such that two sequences separated at index S have leaders of the same value. The leader of a sequence is defined as the value that occurs in more than half of the elements of that sequence. Since the equi-leader is a subset of the original array, and it's defined based on having the same leader value on both sides of the split, it follows that the dominator of the equi-leader subsequence is the same as the dominator of the original array. Therefore, in the context of this problem, the equi-leaders and the dominator indeed have the same face value.
Dominator (a problem on the concept of Leader in Algorithms using Boyer-Moore Majority Vote Algo)
Tags: Algorithms,Technology,Dominator: Find an index of an array such that its value occurs at more than half of indices in the array.
An array A consisting of N integers is given. The dominator of array A is the value that occurs in more than half of the elements of A. For example, consider array A such that A[0] = 3 A[1] = 4 A[2] = 3 A[3] = 2 A[4] = 3 A[5] = -1 A[6] = 3 A[7] = 3 The dominator of A is 3 because it occurs in 5 out of 8 elements of A (namely in those with indices 0, 2, 4, 6 and 7) and 5 is more than a half of 8. Write a function def solution(A) that, given an array A consisting of N integers, returns index of any element of array A in which the dominator of A occurs. The function should return -1 if array A does not have a dominator. For example, given array A such that A[0] = 3 A[1] = 4 A[2] = 3 A[3] = 2 A[4] = 3 A[5] = -1 A[6] = 3 A[7] = 3 the function may return 0, 2, 4, 6 or 7, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [0..100,000]; each element of array A is an integer within the range [−2,147,483,648..2,147,483,647].Taking help from ChatGPT...
Our prompt looks like this: You are a programming guide. An array A consisting of N integers is given... ... within the range [-2,147,483,648..2,147,483,647]. Please guide how to solve this problem step by step.Algo
To solve this problem, we can use the Boyer-Moore Majority Vote algorithm. The algorithm allows us to find a majority element (dominator in this case) in linear time complexity with constant space complexity. Here's how we can solve it step by step: 1. Initialize two variables, candidate and count, where candidate will store the current candidate for the dominator and count will store the count of occurrences of the current candidate. 2. Iterate through the array A: If count is 0, set the current element as the candidate and set count to 1. If the current element is equal to the candidate, increment count by 1. If the current element is not equal to the candidate, decrement count by 1. 3. After iterating through the array, candidate will hold a potential dominator candidate. We need to verify if it's indeed the dominator by counting its occurrences in the array. 4. Iterate through the array again to count the occurrences of the candidate. If the count of occurrences of candidate is greater than half the length of the array, return the index of any occurrence of candidate. If no dominator is found, return -1. Let's implement this in code:Code
def solution(A): candidate = None count = 0 # Find candidate for dominator for num in A: if count == 0: candidate = num count = 1 elif num == candidate: count += 1 else: count -= 1 # Count occurrences of candidate count_candidate = A.count(candidate) # Check if candidate is the dominator if count_candidate > len(A) // 2: return A.index(candidate) else: return -1 # Example usage A = [3, 4, 3, 2, 3, -1, 3, 3] print(solution(A)) # Output should be 0, 2, 4, 6, or 7
Manhattan stonewall problem on the concept of Stacks
Tags: Algorithms,Technology,StoneWall: Cover "Manhattan skyline" using the minimum number of rectangles.
You are going to build a stone wall. The wall should be straight and N meters long, and its thickness should be constant; however, it should have different heights in different places. The height of the wall is specified by an array H of N positive integers. H[I] is the height of the wall from I to I+1 meters to the right of its left end. In particular, H[0] is the height of the wall's left end and H[N−1] is the height of the wall's right end. The wall should be built of cuboid stone blocks (that is, all sides of such blocks are rectangular). Your task is to compute the minimum number of blocks needed to build the wall. Write a function: def solution(H) that, given an array H of N positive integers specifying the height of the wall, returns the minimum number of blocks needed to build it. For example, given array H containing N = 9 integers: H[0] = 8 H[1] = 8 H[2] = 5 H[3] = 7 H[4] = 9 H[5] = 8 H[6] = 7 H[7] = 4 H[8] = 8 the function should return 7. The figure shows one possible arrangement of seven blocks. Write an efficient algorithm for the following assumptions: N is an integer within the range [1..100,000]; each element of array H is an integer within the range [1..1,000,000,000].Taking help from ChatGPT...
Our prompt looks like: You are a programming guide. You are going to build a stone wall... ... the function should return 7. Please guide how to solve this problem step by step.Algo
To solve this problem, we can use a stack-based approach. The idea is to traverse through the array of heights and maintain a stack that keeps track of the heights encountered so far. We will also keep track of the number of blocks used. Here's how we can solve it step by step: 1. Initialize an empty stack and a variable to count the number of blocks used. 2. Iterate through the array of heights. 3. For each height, check whether it is greater than the top of the stack. If it is, push the height onto the stack. 4. If the height is less than the top of the stack, pop heights from the stack until the top of the stack is less than or equal to the current height. 5. For each height popped from the stack, increment the block count by 1, as each height represents a block. 6. If after popping from the stack, the stack is empty or the current height is greater than the top of the stack, push the current height onto the stack. 7. After iterating through all heights, return the block count. Let's implement this in code:Code
def solution(H): stack = [] # Stack to keep track of heights blocks_used = 0 # Counter for blocks used for height in H: # Pop heights from stack until top is less than or equal to current height while stack and stack[-1] > height: stack.pop() blocks_used += 1 # If stack is empty or current height is greater than top of stack, push height if not stack or height > stack[-1]: stack.append(height) # Add remaining heights in stack to block count blocks_used += len(stack) return blocks_used # Example usage H = [8, 8, 5, 7, 9, 8, 7, 4, 8] print(solution(H)) # Output should be 7 This algorithm has a time complexity of O(N), where N is the number of elements in the array H, as it iterates through the array once.
Friday, March 29, 2024
The Surprising Power of Atomic Habits (Ch 1)
Tags: Book Summary,Behavioral Science,WHY SMALL HABITS MAKE A BIG DIFFERENCE
Here’s how the math works out: if you can get 1 percent better each day for one year, you’ll end up thirty-seven times better by the time you’re done. Conversely, if you get 1 percent worse each day for one year, you’ll decline nearly down to zero. 1% BETTER EVERY DAY 1% worse every day for one year. (0.99)^365 = 00.03 1% better every day for one year. (1.01)^365 = 37.78 ~~~ FIGURE 1: The effects of small habits compound over time. For example, if you can get just 1 percent better each day, you’ll end up with results that are nearly 37 times better after one year. ~~~ The impact created by a change in your habits is similar to the effect of shifting the route of an airplane by just a few degrees. Imagine you are flying from Los Angeles to New York City. If a pilot leaving from LAX adjusts the heading just 3.5 degrees south, you will land in Washington, D.C., instead of New York. Such a small change is barely noticeable at takeoff—the nose of the airplane moves just a few feet— but when magnified across the entire United States, you end up hundreds of miles apart.WHAT PROGRESS IS REALLY LIKE
Imagine that you have an ice cube sitting on the table in front of you. The room is cold and you can see your breath. It is currently twenty-five degrees. Ever so slowly, the room begins to heat up. Twenty-six degrees. Twenty-seven. Twenty-eight. The ice cube is still sitting on the table in front of you. Twenty-nine degrees. Thirty. Thirty-one. Still, nothing has happened. Then, thirty-two degrees. The ice begins to melt. A one-degree shift, seemingly no different from the temperature increases before it, has unlocked a huge change. Breakthrough moments are often the result of many previous actions, which build up the potential required to unleash a major change. This pattern shows up everywhere. Cancer spends 80 percent of its life undetectable, then takes over the body in months. Bamboo can barely be seen for the first five years as it builds extensive root systems underground before exploding ninety feet into the air within six weeks. Similarly, habits often appear to make no difference until you cross a critical threshold and unlock a new level of performance. In the early and middle stages of any quest, there is often a Valley of Disappointment. You expect to make progress in a linear fashion and it’s frustrating how ineffective changes can seem during the first days, weeks, and even months. It doesn’t feel like you are going anywhere. It’s a hallmark of any compounding process: the most powerful outcomes are delayed.THE PLATEAU OF LATENT POTENTIAL
FIGURE 2: We often expect progress to be linear. At the very least, we hope it will come quickly. In reality, the results of our efforts are often delayed. It is not until months or years later that we realize the true value of the previous work we have done. This can result in a “valley of disappointment” where people feel discouraged after putting in weeks or months of hard work without experiencing any results. However, this work was not wasted. It was simply being stored. It is not until much later that the full value of previous efforts is revealed.FORGET ABOUT GOALS, FOCUS ON SYSTEMS INSTEAD
What’s the difference between systems and goals? It’s a distinction I first learned from Scott Adams, the cartoonist behind the Dilbert comic. Goals are about the results you want to achieve. Systems are about the processes that lead to those results. If you’re a coach, your goal might be to win a championship. Your system is the way you recruit players, manage your assistant coaches, and conduct practice. If you’re an entrepreneur, your goal might be to build a million-dollar business. Your system is how you test product ideas, hire employees, and run marketing campaigns. If you’re a musician, your goal might be to play a new piece. Your system is how often you practice, how you break down and tackle difficult measures, and your method for receiving feedback from your instructor. ~~~ A handful of problems arise when you spend too much time thinking about your goals and not enough time designing your systems. Problem #1: Winners and losers have the same goals. Problem #2: Achieving a goal is only a momentary change. Problem #3: Goals restrict your happiness. ...Because the implicit assumption behind any goal is this: “Once I reach my goal, then I’ll be happy.” Problem #4: Goals are at odds with long-term progress The purpose of setting goals is to win the game. The purpose of building systems is to continue playing the game. True long-term thinking is goal-less thinking. It’s not about any single accomplishment. It is about the cycle of endless refinement and continuous improvement. Ultimately, it is your commitment to the process that will determine your progress. ~~~A SYSTEM OF ATOMIC HABITS
If you’re having trouble changing your habits, the problem isn’t you. The problem is your system. Bad habits repeat themselves again and again not because you don’t want to change, but because you have the wrong system for change. You do not rise to the level of your goals. You fall to the level of your systems. Focusing on the overall system, rather than a single goal, is one of the core themes of this book. It is also one of the deeper meanings behind the word atomic. By now, you’ve probably realized that an atomic habit refers to a tiny change, a marginal gain, a 1 percent improvement. But atomic habits are not just any old habits, however small. They are little habits that are part of a larger system. Just as atoms are the building blocks of molecules, atomic habits are the building blocks of remarkable results. Habits are like the atoms of our lives. Each one is a fundamental unit that contributes to your overall improvement. At first, these tiny routines seem insignificant, but soon they build on each other and fuel bigger wins that multiply to a degree that far outweighs the cost of their initial investment. They are both small and mighty. This is the meaning of the phrase atomic habits—a regular practice or routine that is not only small and easy to do, but also the source of incredible power; a component of the system of compound growth.KEY POINTS... AGAIN
#1 Habits are the compound interest of self-improvement. Getting 1 percent better every day counts for a lot in the long-run. #2 Habits are a double-edged sword. They can work for you or against you, which is why understanding the details is essential. #3 Small changes often appear to make no difference until you cross a critical threshold. The most powerful outcomes of any compounding process are delayed. You need to be patient. #4 An atomic habit is a little habit that is part of a larger system. Just as atoms are the building blocks of molecules, atomic habits are the building blocks of remarkable results. #5 If you want better results, then forget about setting goals. Focus on your system instead. #6 You do not rise to the level of your goals. You fall to the level of your systems.
Tuesday, March 19, 2024
Show All Interview Questions
User Registration
First time users, please register...
User Login
If you already have an account, please login...
# | Topic | Subtopic | Question | Options | Answer | Dated | Ques Type | Hint |
---|---|---|---|---|---|---|---|---|
Create a plain HTML form and Link it to a simple Flask API to display the contents of the form
HTML: index page
<div class="container"> <h2>User Registration</h2> <form action="http://127.0.0.1:5000/submit" method="post"> <div class="form-group"> <label for="username">Username:</label> <input type="text" id="username" name="username" required> </div> <div class="form-group"> <label for="password">Password:</label> <input type="password" id="password" name="password" required> </div> <div class="form-group"> <label for="confirm_password">Confirm Password:</label> <input type="password" id="confirm_password" name="confirm_password" required> </div> <button type="submit" class="btn" data-bind="click: customRegister">Register</button> </form> </div>Python Code For Flask API
from flask import Flask, redirect, render_template, request, url_for import json, requests from flask_cors import CORS, cross_origin from flask import Response, jsonify app=Flask(__name__) @app.route('/') def welcome(): return render_template("index.html") @app.route("/submit", methods=["POST"]) def submit(): print(request.data) username = "" password = "" retypepassword = "" if request.method=="POST": username = request.form["username"] password = request.form["password"] retypepassword = request.form["confirm_password"] print(username, password, retypepassword) # return render_template("result.html", username=username, password=password, retypepassword=retypepassword) return "Hi, " + username if __name__=='__main__': app.run(debug=True)
Monday, March 18, 2024
Books on Large Language Models (Mar 2024)
1. Quick Start Guide to Large Language Models: Strategies and Best Practices for Using ChatGPT and Other LLMs Sinan Ozdemir, 2023 2. GPT-3: Building Innovative NLP Products Using Large Language Models Sandra Kublik, 2022 3. Understanding Large Language Models: Learning Their Underlying Concepts and Technologies Thimira Amaratunga, 2023 4. Introduction to Large Language Models for Business Leaders: Responsible AI Strategy Beyond Fear and Hype I. Almeida, 2023 5. Pretrain Vision and Large Language Models in Python: End-to-end Techniques for Building and Deploying Foundation Models on AWS Emily Webber, 2023 6. Modern Generative AI with ChatGPT and OpenAI Models: Leverage the Capabilities of OpenAI's LLM for Productivity and Innovation with GPT3 and GPT4 Valentina Alto, 2023 7. Generative AI with LangChain: Build Large Language Model (LLM) Apps with Python, ChatGPT, and Other LLMs Ben Auffarth, 2023 8. Natural Language Processing with Transformers Lewis Tunstall, 2022 9. Generative AI on AWS Chris Fregly, 2023 10. Decoding GPT: An Intuitive Understanding of Large Language Models Generative AI Machine Learning and Neural Networks Devesh Rajadhyax, 2024 11. Retrieval-Augmented Generation (RAG): Empowering Large Language Models (LLMs) Ray Islam (Mohammad Rubyet Islam), 2023 12. Learn Python Generative AI: Journey from Autoencoders to Transformers to Large Language Models (English Edition) Indrajit Kar, 2024 13. Natural Language Understanding with Python: Combine Natural Language Technology, Deep Learning, and Large Language Models to Create Human-like Language Comprehension in Computer Systems Deborah A. Dahl, 2023 14. Developing Apps with GPT-4 and ChatGPT Olivier Caelen, 2023 15. Generative Deep Learning David Foster, 2022 16. Foundation Models for Natural Language Processing: Pre-trained Language Models Integrating Media Gerhard Paass, 2023 17. What is ChatGPT Doing ... and why Does it Work? Stephen Wolfram, 2023 18. Artificial Intelligence and Large Language Models: An Introduction to the Technological Future Al-Sakib Khan Pathan, 2024 19. Large Language Model-Based Solutions: How to Deliver Value with Cost-Effective Generative AI Applications Shreyas Subramanian, 2024 20. Introduction to Transformers for NLP: With the Hugging Face Library and Models to Solve Problems Shashank Mohan Jain, 2022 21. Generative AI for Leaders Amir Husain, 2023 22. Machine Learning Engineering with Python: Manage the Production Life Cycle of Machine Learning Models Using MLOps with Practical Examples Andrew P. McMahon, 2021 23. Artificial Intelligence Fundamentals for Business Leaders: Up to Date With Generative AI I. Almeida, 2023 24. Transformers For Natural Language Processing: Build, Train, and Fine-tune Deep Neural Network Architectures for NLP with Python, Hugging Face, and OpenAI's GPT-3, ChatGPT, and GPT-4 Denis Rothman, 2022
Saturday, March 9, 2024
What is an RDD in PySpark?
RDD, which stands for Resilient Distributed Dataset, is a fundamental data structure in Apache Spark, a distributed computing framework for big data processing. RDDs are immutable, partitioned collections of objects that can be processed in parallel across a cluster of machines. The term "resilient" in RDD refers to the fault-tolerance feature, meaning that RDDs can recover lost data due to node failures. Here are some key characteristics and properties of RDDs in PySpark: # Immutable: Once created, RDDs cannot be modified. However, you can transform them into new RDDs by applying various operations. # Distributed: RDDs are distributed across multiple nodes in a cluster, allowing for parallel processing. # Partitioned: RDDs are divided into partitions, which are the basic units of parallelism. Each partition can be processed independently on different nodes. # Lazy Evaluation: Transformations on RDDs are lazily evaluated, meaning that the execution is deferred until an action is triggered. This helps optimize the execution plan and avoid unnecessary computations. # Fault-Tolerant: RDDs track the lineage information to recover lost data in case of node failures. This is achieved through the ability to recompute lost partitions based on the transformations applied to the original data. In PySpark, you can create RDDs from existing data in memory or by loading data from external sources such as HDFS, HBase, or other storage systems. Once created, you can perform various transformations (e.g., map, filter, reduce) and actions (e.g., count, collect, save) on RDDs. However, it's worth noting that while RDDs were the primary abstraction in earlier versions of Spark, newer versions have introduced higher-level abstractions like DataFrames and Datasets, which provide a more structured and optimized API for data manipulation and analysis. These abstractions are built on top of RDDs and offer better performance and ease of use in many scenarios.
5 Questions on PySpark Technology
Q1 of 5 Which of the below Spark Core API is used to load the retail.csv file and create RDD? retailRDD = sc.readFile("/HDFSPATH/retail.csv") retailRDD = sc.parallelize("/HDFSPATH/retail.csv") retailRDD = sc.textFile("/HDFSPATH/retail.csv") *** retailRDD = sc.createFile("/HDFSPATH/retail.csv") Q2 of 5 Shane works in data analytics project and needs to process Users event data (UserLogs.csv file). Which of the below code snippet can be used to split the fields with a comma as a delimiter and fetch only the first two fields from it? logsRDD = sc.textFile("/HDFSPATH/UserLogs.csv"); FieldsRDD = logsRDD.map(lambda r : r.split(",")).map(lambda r: (r[0],r[1])) *** logsRDD = sc.parallelize("/HDFSPATH/UserLogs.csv"); FieldsRDD = logsRDD.map(lambda r : r.split(",")).map(lambda r: (r[0],r[1])) logsRDD = sc.parallelize("/HDFSPATH/UserLogs.csv"); FieldsRDD = logsRDD.filter(lambda r : r.split(",")).map(lambda r: (r[0],r[1])) logsRDD = sc.textFile("/HDFSPATH/UserLogs.csv"); FieldsRDD = logsRDD.filter(lambda r : r.split(",")).map(lambda r: (r[0],r[1])) Q3 of 5 Consider a retail scenario where a paired RDD exists with data (ProductName, Price). Price value must be reduced by 500 as a customer discount. Which paired RDD function in spark can be used for this requirement? mapValues() keys() values() map() --- mapValues applies the function logic to the value part of the paired RDD without changing the key Q4 of 5 Consider a banking scenario where credit card transaction logs need to be processed. The log contains CustomerID, CustomerName, CreditCard Number, and TransactionAmount fields. Which code snippet below creates a paired RDD ? logsRDD = sc.textFile("/HDFSPath/Logs.txt"); logsRDD = sc.textFile("/HDFSPath/Logs.txt"); LogsPairedRDD = logsRDD.map(lambda r : r.split(",")).map(lambda r: (r[0],int(r[3]))) *** logsRDD = sc.textFile("/HDFSPath/Logs.txt"); LogsPairedRDD = logsRDD.map(lambda r : r.split(",")).map(lambda r: (r[0],int(r[2]))) logsRDD = sc.textFile("/HDFSPath/Logs.txt").map(lambda r: (r[0],int(r[3]))) Q5 of 5 Consider a Spark scenario where an array must be used as a Broadcast variable. Which of the below code snippet is used to access the broadcast variable value? bv = sc.broadcast(Array(100,200,300)) bv.getValue --- bv = sc.broadcast(Array(100,200,300)) bv.value bv = sc.broadcast(Array(100,200,300)) bv.find bv = sc.broadcast(Array(100,200,300)) bv.fetchValueSpark Core Challenges
Business Scenario Arisconn Cars provides rental car service across the globe. To improve their customer service, the client wants to analyze periodically each car’s sensor data to repair faults and problems in the car. Sensor data from cars are streamed through events hub (data ingestion tool) into Hadoop's HDFS (distributed file system) and analyzed using Spark Core programming to find out cars generating maximum errors. This analysis would help Arisconn to send the service team to repair the cars even before they fail. Below is the Schema of the big dataset of Arisconn Cars which holds 10 million records approximately. [sensorID, carID, latitude, longitude, engine_speed, accelerator_pedal_position, vehicle_speed, torque_at_transmission, fuel_level, typeOfMessage, timestamp] typeOfMessage: INFO, WARN, ERR, DEBUG Arisconn has the below set of requirements to be performed against the dataset: Filter fields - Sensor id, Car id, Latitude, Longitude, Vehicle Speed, TypeOfMessage Filter valid records i.e., discard records containing '?' Filter records holding only error messages (ignore warnings and info messages) Apply aggregation to count number of error messages produced by cars Below is the Python code to implement the first three requirements. #Loading a text file in to an RDD Car_Info = sc.textFile("/HDFSPath/ArisconnDataset.txt"); #Referring the header of the file header=Car_Info.first() #Removing header and splitting records with ',' as delimiter and fetching relevant fields Car_temp = Car_Info.filter(lambda record:record!=header).map(lambda r:r.split(",")).map(lambda c:(c[0],c[1],float([2]),float(c[4]),int(c[6]),c[9])); #Filtering only valid records(records not starting with '?'), and f[1] refers to first field (sensorid) Car_Eng_Specs = Car_temp.filter(lambda f:str(f[1]).startswith("?")) #Filtering records holding only error messages and f[6] refers to 6th field (Typeofmessage) Car_Error_logs = Car_Eng_Specs.filter(lambda f:str(f[6]).startswith("ERR")) In the above code, Arisconn's dataset is loaded into RDD (Car_Info) The header of the dataset is removed and only fields (sensorid, carid, latitude, longitude, vehiclespeed, TypeOfMessage) are filtered. Refer to RDD Car_temp Records starting with '?' are removed. Refer to RDD Car_Eng_Specs. Records containing TypeOfMessage = "ERR" get filtered There are few challenges in the above code and even the fourth requirement is too complex to implement in Spark Core. We shall discuss this next.
Friday, March 8, 2024
Voracious Fish (A problem on the concept of Stacks)
Fish: N voracious fish are moving along a river. Calculate how many fish are alive. You are given two non-empty arrays A and B consisting of N integers. Arrays A and B represent N voracious fish in a river, ordered downstream along the flow of the river. The fish are numbered from 0 to N − 1. If P and Q are two fish and P < Q, then fish P is initially upstream of fish Q. Initially, each fish has a unique position. Fish number P is represented by A[P] and B[P]. Array A contains the sizes of the fish. All its elements are unique. Array B contains the directions of the fish. It contains only 0s and/or 1s, where: 0 represents a fish flowing upstream, 1 represents a fish flowing downstream. If two fish move in opposite directions and there are no other (living) fish between them, they will eventually meet each other. Then only one fish can stay alive − the larger fish eats the smaller one. More precisely, we say that two fish P and Q meet each other when P < Q, B[P] = 1 and B[Q] = 0, and there are no living fish between them. After they meet: If A[P] > A[Q] then P eats Q, and P will still be flowing downstream, If A[Q] > A[P] then Q eats P, and Q will still be flowing upstream. We assume that all the fish are flowing at the same speed. That is, fish moving in the same direction never meet. The goal is to calculate the number of fish that will stay alive. For example, consider arrays A and B such that: A[0] = 4 B[0] = 0 A[1] = 3 B[1] = 1 A[2] = 2 B[2] = 0 A[3] = 1 B[3] = 0 A[4] = 5 B[4] = 0 Initially all the fish are alive and all except fish number 1 are moving upstream. Fish number 1 meets fish number 2 and eats it, then it meets fish number 3 and eats it too. Finally, it meets fish number 4 and is eaten by it. The remaining two fish, number 0 and 4, never meet and therefore stay alive. Write a function: def solution(A, B) that, given two non-empty arrays A and B consisting of N integers, returns the number of fish that will stay alive. For example, given the arrays shown above, the function should return 2, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [1..100,000]; each element of array A is an integer within the range [0..1,000,000,000]; each element of array B is an integer that can have one of the following values: 0, 1; the elements of A are all distinct. class Fish(): def __init__(self, size, direction): self.size = size self.direction = direction def solution(A, B): stack = [] survivors = 0 for i in range(len(A)): if B[i] == 1: stack.append(A[i]) else: weightdown = stack.pop() if stack else -1 while weightdown != -1 and weightdown < A[i]: weightdown = stack.pop() if stack else -1 if weightdown == -1: survivors += 1 else: stack.append(weightdown) return survivors + len(stack) Correctness tests ▶ extreme_small 1 or 2 fishes ▶ simple1 simple test ▶ simple2 simple test ▶ small_random small random test, N = ~100 Performance tests ▶ medium_random small medium test, N = ~5,000 ▶ large_random large random test, N = ~100,000 ▶ extreme_range1 all except one fish flowing in the same direction ▶ extreme_range2 all fish flowing in the same direction
Tuesday, March 5, 2024
Brackets (A problem related to Stacks)
Problem
Brackets: Determine whether a given string of parentheses (multiple types) is properly nested. A string S consisting of N characters is considered to be properly nested if any of the following conditions is true: S is empty; S has the form "(U)" or "[U]" or "{U}" where U is a properly nested string; S has the form "VW" where V and W are properly nested strings. For example, the string "{[()()]}" is properly nested but "([)()]" is not. Write a function: class Solution { public int solution(String S); } that, given a string S consisting of N characters, returns 1 if S is properly nested and 0 otherwise. For example, given S = "{[()()]}", the function should return 1 and given S = "([)()]", the function should return 0, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [0..200,000]; string S is made only of the following characters: '(', '{', '[', ']', '}' and/or ')'.Solution
Task Score: 87% Correctness: 100% Performance: 80% class Node(): def __init__(self, x): self.x = x self.next = None class Stack(): # head is default NULL def __init__(self): self.head = None # Checks if stack is empty def isempty(self): if self.head == None: return True else: return False def push(self, x): if self.head == None: self.head = Node(x) else: newnode = Node(x) newnode.next = self.head self.head = newnode def pop(self): if self.head == None: return None else: popped_node = self.head self.head = self.head.next popped_node.next = None return popped_node.x # Returns the head node data def peek(self): if self.isempty(): return None else: return self.head.x def solution(S): s = Stack() for i in range(len(S)): if S[i] in ['(', '{', '[']: s.push(S[i]) elif (S[i] == ')' and s.peek() == '(') or (S[i] == '}' and s.peek() == '{') or (S[i] == ']' and s.peek() == '['): s.pop() if s.isempty(): return 1 else: return 0
Number of disc intersections (Problem in sorting and searching)
Problem
Number Of Disc Intersections: Compute the number of intersections in a sequence of discs. We draw N discs on a plane. The discs are numbered from 0 to N − 1. An array A of N non-negative integers, specifying the radiuses of the discs, is given. The J-th disc is drawn with its center at (J, 0) and radius A[J]. We say that the J-th disc and K-th disc intersect if J ≠ K and the J-th and K-th discs have at least one common point (assuming that the discs contain their borders). The figure below shows discs drawn for N = 6 and A as follows: A[0] = 1 A[1] = 5 A[2] = 2 A[3] = 1 A[4] = 4 A[5] = 0 There are eleven (unordered) pairs of discs that intersect, namely: discs 1 and 4 intersect, and both intersect with all the other discs; disc 2 also intersects with discs 0 and 3. Write a function in Java with following signature: class Solution { public int solution(int[] A); } And in Python with following signature: def solution(A): that, given an array A describing N discs as explained above, returns the number of (unordered) pairs of intersecting discs. The function should return −1 if the number of intersecting pairs exceeds 10,000,000. Given array A shown above, the function should return 11, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [0..100,000]; each element of array A is an integer within the range [0..2,147,483,647].Solution
WITHOUT BINARY SEARCH
Task Score: 62% Correctness: 100% Performance: 25% ~~~ def solution(A): # Implement your solution here # Get the start and end indices of the disks and sort them based on the start indices start_and_end = [] for i in range(len(A)): temp = { 'start': i - A[i], 'end': i + A[i] } start_and_end.append(temp) start_and_end.sort(key = lambda d: d['start']) # Search for the starting index bigger than the current end index intersections = [] for i in range(len(start_and_end)): end = start_and_end[i]['end'] cnt_intersections = 0 for j in range(i+1, len(start_and_end)): if start_and_end[j]['start'] <= end: cnt_intersections += 1 else: break intersections.append(cnt_intersections) return sum(intersections) Ref