Ref: Link to full course on DeepLearning.ai
Thursday, March 27, 2025
Principles of Agentic Code Development
Monday, April 15, 2024
Maximum Slice Problem (using Kadane's Algo and Dynamic Programming)
Maximum slice problem
Let’s define a problem relating to maximum slices. You are given a sequence of n integers a0 , a1 , . . . , an−1 and the task is to find the slice with the largest sum. More precisely, we are looking for two indices p, q such that the total ap + ap+1 + . . . + aq is maximal. We assume that the slice can be empty and its sum equals 0. In the picture, the slice with the largest sum is highlighted in gray. The sum of this slice equals 10 and there is no slice with a larger sum. Notice that the slice we are looking for may contain negative integers, as shown above.
In other words...
A non-empty array A consisting of N integers is given. A pair of integers (P, Q), such that 0 ≤ P ≤ Q < N, is called a slice of array A. The sum of a slice (P, Q) is the total of A[P] + A[P+1] + ... + A[Q]. Write a function: def solution(A) that, given an array A consisting of N integers, returns the maximum sum of any slice of A. For example, given array A such that: A[0] = 3 A[1] = 2 A[2] = -6 A[3] = 4 A[4] = 0 the function should return 5 because: (3, 4) is a slice of A that has sum 4, (2, 2) is a slice of A that has sum −6, (0, 1) is a slice of A that has sum 5, no other slice of A has sum greater than (0, 1). Write an efficient algorithm for the following assumptions: N is an integer within the range [1..1,000,000]; each element of array A is an integer within the range [−1,000,000..1,000,000]; the result will be an integer within the range [−2,147,483,648..2,147,483,647].
Solution
Please solve this coding puzzleThe image you sent describes the Maximum Slice Problem. It is a coding challenge where you’re given an array of integers and you need to find the subarray (slice) with the largest sum. The slice can be empty, in which case the sum is zero.
Here’s how to solve this problem:
- Kadane’s Algorithm: There is an efficient algorithm to solve the maximum slice problem called Kadane’s Algorithm. It uses dynamic programming to find the maximum sum subarray in linear time (O(n)).
The idea is to keep track of two variables:
- current_sum: This variable stores the sum of the current subarray.
- max_sum: This variable stores the maximum sum of any subarray seen so far.
We iterate through the array and at each element, we do the following: * If the current_sum is less than zero, we restart the current subarray by setting it to the current element. * Otherwise, we add the current element to the current_sum. * We then compare the current_sum with the max_sum and update the max_sum if the current_sum is greater.
- Implementation: Here’s how you can implement Kadane’s Algorithm in Python:
def max_slice_sum(arr):
"""
Finds the maximum sum of a contiguous subarray in a given array.
Args:
arr: A list of integers.
Returns:
The maximum sum of a contiguous subarray in the array.
"""
current_sum = max_sum = arr[0]
for num in arr[1:]:
current_sum = max(num, current_sum + num)
max_sum = max(max_sum, current_sum)
return max_sum
This function iterates over the array ‘arr’ and calculates the maximum sum of a slice.
You can call the function like this:
arr = [5, -7, 8, -4, 1]
print(max_slice_sum(arr)) # Output: 8
In this example, the maximum sum of a slice is 8 (8).
I hope this helps!
Can you explain this code to a layman?Imagine you're walking a trail and collecting rocks. Each rock has a weight, positive if it's nice and heavy, negative if it's a drag. You want to find the biggest haul you can carry, but you can't go back and drop rocks you've already picked up.
This code is like figuring out the heaviest load you can carry on that trail.
Here's how it works:
-
Two piles:
- Current pile: This is the pile of rocks you're carrying right now. It starts with just the first rock on the trail (the first number in the list).
- Best pile ever: This keeps track of the heaviest pile you've found so far on the trail.
-
Walking the trail:
- You go through each rock on the trail (except the first one, since that's already in your current pile).
- At each rock, you check your current pile:
- If the current pile is too heavy (negative weight), you ditch it and start fresh with just this new rock.
- If the current pile is okay (positive or zero weight), you add this new rock to it.
- No matter what you did with the current pile, you then compare its weight to the "best pile ever." If the current pile is heavier, then the current pile becomes the new "best pile ever."
-
The answer:
- Once you've finished going through all the rocks, the "best pile ever" holds the heaviest load you could have carried.
The code uses fancy names for these piles ("current_sum" and "max_sum"), but that's the basic idea. It keeps track of the best possible haul you can make as you walk the trail (process the list).
Please listen carefully. What is the difference between following two functions: def golden_max_slice(A): max_ending = max_slice = 0 for a in A: max_ending = max(0, max_ending + a) max_slice = max(max_slice, max_ending) return max_slice def max_slice_sum(arr): """ Finds the maximum sum of a contiguous subarray in a given array. Args: arr: A list of integers. Returns: The maximum sum of a contiguous subarray in the array. """ current_sum = max_sum = arr[0] for num in arr[1:]: current_sum = max(num, current_sum + num) max_sum = max(max_sum, current_sum) return max_sum
The two functions you provided, golden_max_slice
and max_slice_sum
, both solve the same problem: finding the maximum sum of a contiguous subarray in a given list of integers. However, they achieve this in slightly different ways:
1. Variable Naming:
max_slice_sum
: This function uses more descriptive names.current_sum
: Clearly indicates the current sum of the subarray being considered.max_sum
: Represents the maximum sum found so far.
golden_max_slice
: Uses shorter names.max_ending
: Less descriptive but conveys the idea of the maximum sum ending at the current element.max_slice
: Same as inmax_slice_sum
.
2. Handling Negative Sums:
max_slice_sum
: This function explicitly restarts thecurrent_sum
to be the current element if it becomes negative. This ensures that the current subarray only considers elements that contribute positively to the sum.golden_max_slice
: This function implicitly handles negative sums. It only adds the current element tomax_ending
if the resulting sum would be positive. If the sum becomes negative,max_ending
effectively restarts at 0 (sincemax(0, a + b)
will be 0 for negative values of a).
In essence:
- Both functions implement Kadane's Algorithm.
max_slice_sum
might be easier to understand for beginners due to its clear variable names and explicit handling of negative sums.golden_max_slice
is a more concise approach that achieves the same result.
Efficiency:
Both functions have a time complexity of O(n), which is optimal for this problem as they iterate through the list once.
Tags: Technology,Algorithms,Python,Thursday, April 4, 2024
EquiLeader (A problem in the concept of Leader in DSA using Boyer-Moore Majority Vote Algo)
Tags: Algorithms,Technology,EquiLeader: Find the index S such that the leaders of the sequences A[0], A[1], ..., A[S] and A[S + 1], A[S + 2], ..., A[N - 1] are the same.
A non-empty array A consisting of N integers is given. The leader of this array is the value that occurs in more than half of the elements of A. An equi leader is an index S such that 0 ≤ S < N - 1 and two sequences A[0], A[1], ..., A[S] and A[S + 1], A[S + 2], ..., A[N - 1] have leaders of the same value. For example, given array A such that: A[0] = 4 A[1] = 3 A[2] = 4 A[3] = 4 A[4] = 4 A[5] = 2 we can find two equi leaders: 0, because sequences: (4) and (3, 4, 4, 4, 2) have the same leader, whose value is 4. 2, because sequences: (4, 3, 4) and (4, 4, 2) have the same leader, whose value is 4. The goal is to count the number of equi leaders. Write a function: def solution(A) that, given a non-empty array A consisting of N integers, returns the number of equi leaders. For example, given: A[0] = 4 A[1] = 3 A[2] = 4 A[3] = 4 A[4] = 4 A[5] = 2 the function should return 2, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [1..100,000]; each element of array A is an integer within the range [-1,000,000,000..1,000,000,000].Taking help from ChatGPT
Our prompt looks like this: You are a programming guide. Here is my problem... ...Please guide how to solve this problem step by step.To solve this problem efficiently, we can use a two-step approach:
- Find the dominator (leader) of the array.
- Count the number of equi leaders.
Here's a step-by-step guide on how to approach this problem:
Step 1: Find the Dominator
Initialize two variables,
candidate
andcount
, wherecandidate
will store the current candidate for the dominator andcount
will store the count of occurrences of the current candidate.Iterate through the array
A
:
- If
count
is 0, set the current element as thecandidate
and setcount
to 1.- If the current element is equal to the
candidate
, incrementcount
by 1.- If the current element is not equal to the
candidate
, decrementcount
by 1.After iterating through the array,
candidate
will hold a potential dominator candidate. We need to verify if it's indeed the dominator by counting its occurrences in the array.Count occurrences of
candidate
.
- If the count of occurrences of
candidate
is not more than half the length of the array, there is no dominator.Step 2: Count the Equi Leaders
def solution(A): candidate = None count = 0 # Find candidate for dominator for num in A: if count == 0: candidate = num count = 1 elif num == candidate: count += 1 else: count -= 1 # Count occurrences of candidate count_candidate = A.count(candidate) # If candidate is not a dominator, return 0 if count_candidate <= len(A) // 2: return 0 leader_count = 0 equi_leader_count = 0 for i, num in enumerate(A): if num == candidate: leader_count += 1 # Check if the leader appears more than half of the elements up to the current index if leader_count > (i + 1) // 2: # Check if the leader appears more than half of the remaining elements remaining_count = count_candidate - leader_count if remaining_count > (len(A) - i - 1) // 2: equi_leader_count += 1 return equi_leader_count # Example usage A = [4, 3, 4, 4, 4, 2] print(solution(A)) # Output should be 2
- Initialize
leader_count
to count the total occurrences of the dominator.- Initialize
equi_leader_count
to count the number of equi leaders.- Iterate through the array again:
- If the current element equals the dominator, increment
leader_count
.- Calculate the remaining count of the dominator from the end of the array.
- If the count of the dominator up to the current index is more than half of the elements seen so far, and the count of the dominator from the current index to the end is more than half of the remaining elements, increment
equi_leader_count
.What is the core idea behind this algo?
The core idea behind this algorithm is to first find the dominator (leader) of the array efficiently, and then utilize the properties of the dominator to count the number of equi leaders.
Finding the Dominator:
- The algorithm efficiently finds the dominator using the Boyer-Moore Majority Vote algorithm. This algorithm allows us to find a majority element (dominator) in linear time complexity with constant space complexity.
- By iterating through the array, the algorithm identifies a potential dominator candidate and verifies whether it's indeed the dominator by counting its occurrences in the array. If the count of occurrences is more than half of the length of the array, it confirms the presence of a dominator.
Counting the Equi Leaders:
- Once the dominator is found, the algorithm iterates through the array again to count both the total occurrences of the dominator and the number of equi leaders.
- It keeps track of the leader count as it iterates through the array. For each index, it checks if the count of the dominator up to that index is more than half of the elements seen so far. If it is, it also checks if the count of the dominator from the current index to the end is more than half of the remaining elements. If both conditions are met, it increments the equi leader count.
- This approach leverages the fact that an equi leader can only occur if the dominator occurs in both subsequences with counts greater than half of the respective lengths. Therefore, by efficiently tracking these counts, the algorithm can accurately count the equi leaders.
By following this two-step approach, the algorithm effectively solves the problem of finding the number of equi leaders in an array, utilizing the properties of the dominator to optimize the solution.
Is it correct the equi-leaders and dominator have the same face value?
Yes, you are correct. In the context of this problem, the equi-leaders and the dominator have the same face value. In the problem statement, an equi-leader is defined as an index S such that two sequences separated at index S have leaders of the same value. The leader of a sequence is defined as the value that occurs in more than half of the elements of that sequence. Since the equi-leader is a subset of the original array, and it's defined based on having the same leader value on both sides of the split, it follows that the dominator of the equi-leader subsequence is the same as the dominator of the original array. Therefore, in the context of this problem, the equi-leaders and the dominator indeed have the same face value.
Dominator (a problem on the concept of Leader in Algorithms using Boyer-Moore Majority Vote Algo)
Tags: Algorithms,Technology,Dominator: Find an index of an array such that its value occurs at more than half of indices in the array.
An array A consisting of N integers is given. The dominator of array A is the value that occurs in more than half of the elements of A. For example, consider array A such that A[0] = 3 A[1] = 4 A[2] = 3 A[3] = 2 A[4] = 3 A[5] = -1 A[6] = 3 A[7] = 3 The dominator of A is 3 because it occurs in 5 out of 8 elements of A (namely in those with indices 0, 2, 4, 6 and 7) and 5 is more than a half of 8. Write a function def solution(A) that, given an array A consisting of N integers, returns index of any element of array A in which the dominator of A occurs. The function should return -1 if array A does not have a dominator. For example, given array A such that A[0] = 3 A[1] = 4 A[2] = 3 A[3] = 2 A[4] = 3 A[5] = -1 A[6] = 3 A[7] = 3 the function may return 0, 2, 4, 6 or 7, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [0..100,000]; each element of array A is an integer within the range [−2,147,483,648..2,147,483,647].Taking help from ChatGPT...
Our prompt looks like this: You are a programming guide. An array A consisting of N integers is given... ... within the range [-2,147,483,648..2,147,483,647]. Please guide how to solve this problem step by step.Algo
To solve this problem, we can use the Boyer-Moore Majority Vote algorithm. The algorithm allows us to find a majority element (dominator in this case) in linear time complexity with constant space complexity. Here's how we can solve it step by step: 1. Initialize two variables, candidate and count, where candidate will store the current candidate for the dominator and count will store the count of occurrences of the current candidate. 2. Iterate through the array A: If count is 0, set the current element as the candidate and set count to 1. If the current element is equal to the candidate, increment count by 1. If the current element is not equal to the candidate, decrement count by 1. 3. After iterating through the array, candidate will hold a potential dominator candidate. We need to verify if it's indeed the dominator by counting its occurrences in the array. 4. Iterate through the array again to count the occurrences of the candidate. If the count of occurrences of candidate is greater than half the length of the array, return the index of any occurrence of candidate. If no dominator is found, return -1. Let's implement this in code:Code
def solution(A): candidate = None count = 0 # Find candidate for dominator for num in A: if count == 0: candidate = num count = 1 elif num == candidate: count += 1 else: count -= 1 # Count occurrences of candidate count_candidate = A.count(candidate) # Check if candidate is the dominator if count_candidate > len(A) // 2: return A.index(candidate) else: return -1 # Example usage A = [3, 4, 3, 2, 3, -1, 3, 3] print(solution(A)) # Output should be 0, 2, 4, 6, or 7
Manhattan stonewall problem on the concept of Stacks
Tags: Algorithms,Technology,StoneWall: Cover "Manhattan skyline" using the minimum number of rectangles.
You are going to build a stone wall. The wall should be straight and N meters long, and its thickness should be constant; however, it should have different heights in different places. The height of the wall is specified by an array H of N positive integers. H[I] is the height of the wall from I to I+1 meters to the right of its left end. In particular, H[0] is the height of the wall's left end and H[N−1] is the height of the wall's right end. The wall should be built of cuboid stone blocks (that is, all sides of such blocks are rectangular). Your task is to compute the minimum number of blocks needed to build the wall. Write a function: def solution(H) that, given an array H of N positive integers specifying the height of the wall, returns the minimum number of blocks needed to build it. For example, given array H containing N = 9 integers: H[0] = 8 H[1] = 8 H[2] = 5 H[3] = 7 H[4] = 9 H[5] = 8 H[6] = 7 H[7] = 4 H[8] = 8 the function should return 7. The figure shows one possible arrangement of seven blocks. Write an efficient algorithm for the following assumptions: N is an integer within the range [1..100,000]; each element of array H is an integer within the range [1..1,000,000,000].Taking help from ChatGPT...
Our prompt looks like: You are a programming guide. You are going to build a stone wall... ... the function should return 7. Please guide how to solve this problem step by step.Algo
To solve this problem, we can use a stack-based approach. The idea is to traverse through the array of heights and maintain a stack that keeps track of the heights encountered so far. We will also keep track of the number of blocks used. Here's how we can solve it step by step: 1. Initialize an empty stack and a variable to count the number of blocks used. 2. Iterate through the array of heights. 3. For each height, check whether it is greater than the top of the stack. If it is, push the height onto the stack. 4. If the height is less than the top of the stack, pop heights from the stack until the top of the stack is less than or equal to the current height. 5. For each height popped from the stack, increment the block count by 1, as each height represents a block. 6. If after popping from the stack, the stack is empty or the current height is greater than the top of the stack, push the current height onto the stack. 7. After iterating through all heights, return the block count. Let's implement this in code:Code
def solution(H): stack = [] # Stack to keep track of heights blocks_used = 0 # Counter for blocks used for height in H: # Pop heights from stack until top is less than or equal to current height while stack and stack[-1] > height: stack.pop() blocks_used += 1 # If stack is empty or current height is greater than top of stack, push height if not stack or height > stack[-1]: stack.append(height) # Add remaining heights in stack to block count blocks_used += len(stack) return blocks_used # Example usage H = [8, 8, 5, 7, 9, 8, 7, 4, 8] print(solution(H)) # Output should be 7 This algorithm has a time complexity of O(N), where N is the number of elements in the array H, as it iterates through the array once.
Friday, March 8, 2024
Voracious Fish (A problem on the concept of Stacks)
Fish: N voracious fish are moving along a river. Calculate how many fish are alive. You are given two non-empty arrays A and B consisting of N integers. Arrays A and B represent N voracious fish in a river, ordered downstream along the flow of the river. The fish are numbered from 0 to N − 1. If P and Q are two fish and P < Q, then fish P is initially upstream of fish Q. Initially, each fish has a unique position. Fish number P is represented by A[P] and B[P]. Array A contains the sizes of the fish. All its elements are unique. Array B contains the directions of the fish. It contains only 0s and/or 1s, where: 0 represents a fish flowing upstream, 1 represents a fish flowing downstream. If two fish move in opposite directions and there are no other (living) fish between them, they will eventually meet each other. Then only one fish can stay alive − the larger fish eats the smaller one. More precisely, we say that two fish P and Q meet each other when P < Q, B[P] = 1 and B[Q] = 0, and there are no living fish between them. After they meet: If A[P] > A[Q] then P eats Q, and P will still be flowing downstream, If A[Q] > A[P] then Q eats P, and Q will still be flowing upstream. We assume that all the fish are flowing at the same speed. That is, fish moving in the same direction never meet. The goal is to calculate the number of fish that will stay alive. For example, consider arrays A and B such that: A[0] = 4 B[0] = 0 A[1] = 3 B[1] = 1 A[2] = 2 B[2] = 0 A[3] = 1 B[3] = 0 A[4] = 5 B[4] = 0 Initially all the fish are alive and all except fish number 1 are moving upstream. Fish number 1 meets fish number 2 and eats it, then it meets fish number 3 and eats it too. Finally, it meets fish number 4 and is eaten by it. The remaining two fish, number 0 and 4, never meet and therefore stay alive. Write a function: def solution(A, B) that, given two non-empty arrays A and B consisting of N integers, returns the number of fish that will stay alive. For example, given the arrays shown above, the function should return 2, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [1..100,000]; each element of array A is an integer within the range [0..1,000,000,000]; each element of array B is an integer that can have one of the following values: 0, 1; the elements of A are all distinct. class Fish(): def __init__(self, size, direction): self.size = size self.direction = direction def solution(A, B): stack = [] survivors = 0 for i in range(len(A)): if B[i] == 1: stack.append(A[i]) else: weightdown = stack.pop() if stack else -1 while weightdown != -1 and weightdown < A[i]: weightdown = stack.pop() if stack else -1 if weightdown == -1: survivors += 1 else: stack.append(weightdown) return survivors + len(stack) Correctness tests ▶ extreme_small 1 or 2 fishes ▶ simple1 simple test ▶ simple2 simple test ▶ small_random small random test, N = ~100 Performance tests ▶ medium_random small medium test, N = ~5,000 ▶ large_random large random test, N = ~100,000 ▶ extreme_range1 all except one fish flowing in the same direction ▶ extreme_range2 all fish flowing in the same direction
Tuesday, March 5, 2024
Brackets (A problem related to Stacks)
Problem
Brackets: Determine whether a given string of parentheses (multiple types) is properly nested. A string S consisting of N characters is considered to be properly nested if any of the following conditions is true: S is empty; S has the form "(U)" or "[U]" or "{U}" where U is a properly nested string; S has the form "VW" where V and W are properly nested strings. For example, the string "{[()()]}" is properly nested but "([)()]" is not. Write a function: class Solution { public int solution(String S); } that, given a string S consisting of N characters, returns 1 if S is properly nested and 0 otherwise. For example, given S = "{[()()]}", the function should return 1 and given S = "([)()]", the function should return 0, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [0..200,000]; string S is made only of the following characters: '(', '{', '[', ']', '}' and/or ')'.Solution
Task Score: 87% Correctness: 100% Performance: 80% class Node(): def __init__(self, x): self.x = x self.next = None class Stack(): # head is default NULL def __init__(self): self.head = None # Checks if stack is empty def isempty(self): if self.head == None: return True else: return False def push(self, x): if self.head == None: self.head = Node(x) else: newnode = Node(x) newnode.next = self.head self.head = newnode def pop(self): if self.head == None: return None else: popped_node = self.head self.head = self.head.next popped_node.next = None return popped_node.x # Returns the head node data def peek(self): if self.isempty(): return None else: return self.head.x def solution(S): s = Stack() for i in range(len(S)): if S[i] in ['(', '{', '[']: s.push(S[i]) elif (S[i] == ')' and s.peek() == '(') or (S[i] == '}' and s.peek() == '{') or (S[i] == ']' and s.peek() == '['): s.pop() if s.isempty(): return 1 else: return 0
Number of disc intersections (Problem in sorting and searching)
Problem
Number Of Disc Intersections: Compute the number of intersections in a sequence of discs. We draw N discs on a plane. The discs are numbered from 0 to N − 1. An array A of N non-negative integers, specifying the radiuses of the discs, is given. The J-th disc is drawn with its center at (J, 0) and radius A[J]. We say that the J-th disc and K-th disc intersect if J ≠ K and the J-th and K-th discs have at least one common point (assuming that the discs contain their borders). The figure below shows discs drawn for N = 6 and A as follows: A[0] = 1 A[1] = 5 A[2] = 2 A[3] = 1 A[4] = 4 A[5] = 0 There are eleven (unordered) pairs of discs that intersect, namely: discs 1 and 4 intersect, and both intersect with all the other discs; disc 2 also intersects with discs 0 and 3. Write a function in Java with following signature: class Solution { public int solution(int[] A); } And in Python with following signature: def solution(A): that, given an array A describing N discs as explained above, returns the number of (unordered) pairs of intersecting discs. The function should return −1 if the number of intersecting pairs exceeds 10,000,000. Given array A shown above, the function should return 11, as explained above. Write an efficient algorithm for the following assumptions: N is an integer within the range [0..100,000]; each element of array A is an integer within the range [0..2,147,483,647].Solution
WITHOUT BINARY SEARCH
Task Score: 62% Correctness: 100% Performance: 25% ~~~ def solution(A): # Implement your solution here # Get the start and end indices of the disks and sort them based on the start indices start_and_end = [] for i in range(len(A)): temp = { 'start': i - A[i], 'end': i + A[i] } start_and_end.append(temp) start_and_end.sort(key = lambda d: d['start']) # Search for the starting index bigger than the current end index intersections = [] for i in range(len(start_and_end)): end = start_and_end[i]['end'] cnt_intersections = 0 for j in range(i+1, len(start_and_end)): if start_and_end[j]['start'] <= end: cnt_intersections += 1 else: break intersections.append(cnt_intersections) return sum(intersections) Ref
Thursday, January 25, 2024
Triangle formation from three sides (A problem on sorting technique)
Triangle
Determine whether a triangle can be built from a given set of edges.
Complexity: Easy
Problem
An array A consisting of N integers is given. A triplet (P, Q, R) is triangular if 0 ≤ P < Q < R < N and:
A[P] + A[Q] > A[R],
A[Q] + A[R] > A[P],
A[R] + A[P] > A[Q].
Write a function:
def solution(A)
that, given an array A consisting of N integers, returns 1 if there exists a triangular triplet for this array and returns 0 otherwise.
Write an efficient algorithm for the following assumptions:
N is an integer within the range [0..100,000];
each element of array A is an integer within the range [−2,147,483,648..2,147,483,647].
Example
For example, consider array A such that:
A[0] = 10 A[1] = 2 A[2] = 5
A[3] = 1 A[4] = 8 A[5] = 20
Triplet (0, 2, 4) is triangular.
For example, given array A such that:
A[0] = 10 A[1] = 2 A[2] = 5
A[3] = 1 A[4] = 8 A[5] = 20
the function should return 1, as explained above.
Given array A such that:
A[0] = 10 A[1] = 50 A[2] = 5, A[3] = 1
the function should return 0.
Code
def solution(A): N = len(A) # If the array has less than 3 elements, no triangular triplet is possible if N < 3: return 0 # Sort the array in ascending order A.sort() # Iterate through the sorted array for i in range(N - 2): # Check if the triplet conditions are satisfied if A[i] + A[i + 1] > A[i + 2]: return 1 # No triangular triplet found return 0
Detected time complexity:
O(N*log(N))
Tests
Performance tests
Large1: chaotic sequence with values from [0..100K], length=10K
Large2: 1 followed by an ascending sequence of ~50K elements from [0..100K], length=~50K
large_random: chaotic sequence of values from [0..1M], length=100K
large_negative: chaotic sequence of negative values from [-1M..-1], length=100K
large_negative2: chaotic sequence of negative values from [-10..-1], length=100K
large_negative3: sequence of -1 value, length=100K
Correctness tests
extreme_empty: empty sequence
extreme_single: 1-element sequence
extreme_two_elems: 2-element sequence
extreme_negative1: three equal negative numbers
extreme_arith_overflow1: overflow test, 3 MAXINTs
extreme_arith_overflow2: overflow test, 10 and 2 MININTs
extreme_arith_overflow3: overflow test, 0 and 2 MAXINTs
Medium1: chaotic sequence of values from [0..100K], length=30
Medium2: chaotic sequence of values from [0..1K], length=50
Medium3: chaotic sequence of values from [0..1K], length=100
Why this solution works?
# Check if the triplet conditions are satisfied. Given that A is sorted.
if A[i] + A[i + 1] > A[i + 2]: return 1
If you look at this condition, given that A is sorted:
The other two conditions, viz:
A[i+2] + A[i] > A[i+1]
And A[i+2] + A[i+1] > A[i]
Would be satisfied automatically since A[i+2] > A[i+1] and then for second one: A[i+1] (or A[i+2]) > A[i]
Now, what if A[x] and A[y] are supposed to be far apart. Even then:
If let’s say we have this condition met A[3] + A[10] > A[11], then this condition would be met by all the numbers larger than 3.
Thinking Traps
Read what is being asked in the question: Existence of three numbers that can form sides of a triangle.
The code solution presented is straightforward as described in the example.
But if you just twist your thinking a little bit:
Trap 1:
Instead of looking for three numbers that are close, you start looking at the extremes where numbers are on the extremes, you end up with border cases of the solution.
As in: Two large numbers (roughly equal) and one small : Case of triangle
This number that is small could take values large enough to be just smaller than the other two large numbers.
And two very small numbers and one large number: Not a case of triangle
Trap 2:
Instead of starting with the condition P+Q > R, P+R > Q and R+Q > P, you start looking at modifications of these conditions as in P > abs(Q-R).
What if a triplet exists but numbers are far apart?
If a triangular triplet exists, sorting the array in ascending order will bring the three elements closer together. This is because sorting the array ensures that the values are in non-decreasing order, and if a triplet exists, the smallest and middle values of the triplet will be adjacent after sorting.
Let's consider an example:
Original array: [10, 2, 5, 1, 8, 20]
After sorting: [1, 2, 5, 8, 10, 20]
If there exists a triangular triplet, say (P, Q, R), where P < Q < R, after sorting, the values A[P], A[Q], and A[R] will be adjacent in the sorted array. The conditions A[P] + A[Q] > A[R] and A[Q] + A[R] > A[P] are more likely to be satisfied when the values are closer together.
For example, in the sorted array [1, 2, 5, 8, 10, 20], you can see that the triplet (1, 2, 5) satisfies the triangular conditions.
Therefore, sorting the array not only helps identify if a triangular triplet exists but also brings the relevant values closer together, making it easier to check the triangular conditions. If a triplet exists but the values are far apart in the original array, sorting will still reveal the existence of a triplet.
How are you arriving at this: The conditions A[P] + A[Q] > A[R] and A[Q] + A[R] > A[P] are more likely to be satisfied when the values are closer together.
Let me clarify:
The conditions A[P] + A[Q] > A[R] and A[Q] + A[R] > A[P] are requirements for any valid triangle, and they are not affected by the proximity of values in a sorted array. These conditions are fundamental to the triangle inequality theorem.
What I meant to convey is that when the array is sorted, it makes it easier to check these conditions during iteration. Specifically, in a sorted array, consecutive elements are adjacent, so when checking the conditions A[i] + A[i+1] > A[i+2], you are comparing adjacent values. This simplifies the process of checking the triangular conditions as you move through the sorted array.
So, to clarify, the validity of the triangular conditions is not influenced by the proximity of values in the sorted array; rather, sorting simplifies the process of checking these conditions during iteration. I appreciate your understanding, and I hope this clarifies the point.
Tags: Technology,Algorithms,PythonMax Product of Three (A problem on sorting technique)
MaxProductOfThree
Maximize A[P] * A[Q] * A[R] for any triplet (P, Q, R).
Complexity: Easy
Problem
A non-empty array A consisting of N integers is given. The product of triplet (P, Q, R) equates to A[P] * A[Q] * A[R] (0 ≤ P < Q < R < N).
Your goal is to find the maximal product of any triplet.
Write a function:
def solution(A)
that, given a non-empty array A, returns the value of the maximal product of any triplet.
Write an efficient algorithm for the following assumptions:
N is an integer within the range [3..100,000];
each element of array A is an integer within the range [−1,000..1,000].
Example
For example, array A such that:
A[0] = -3
A[1] = 1
A[2] = 2
A[3] = -2
A[4] = 5
A[5] = 6
contains the following example triplets:
(0, 1, 2), product is −3 * 1 * 2 = −6
(1, 2, 4), product is 1 * 2 * 5 = 10
(2, 4, 5), product is 2 * 5 * 6 = 60
The function should return 60, as the product of triplet (2, 4, 5) is maximal.
Code
def solution(A): A.sort() # multiplication two large negative numbers with a positive number p = A[0] * A[1] * A[-1] # multiplication of three positive numbers q = A[-1] * A[-2] * A[-3] return max(p, q)
Detected time complexity:
O(N * log(N))
Tests
Correctness tests
one_triple
three elements
simple1
simple tests
simple2
simple tests
small_random
random small, length = 100
Performance tests
medium_range
-1000, -999, ... 1000, length = ~1,000
medium_random
random medium, length = ~10,000
large_random
random large, length = ~100,000
large_range
2000 * (-10..10) + [-1000, 500, -1]
extreme_large
(-2, .., -2, 1, .., 1) and (MAX_INT)..(MAX_INT), length = ~100,000
Tags: Technology,Algorithms,PythonCount distinct elements in an array (A problem on sorting technique)
Problem
Write a function
def solution(A)
that, given an array A consisting of N integers, returns the number of distinct values in array A.
For example, given array A consisting of six elements such that:
A[0] = 2 A[1] = 1 A[2] = 1
A[3] = 2 A[4] = 3 A[5] = 1
the function should return 3, because there are 3 distinct values appearing in array A, namely 1, 2 and 3.
Write an efficient algorithm for the following assumptions:
N is an integer within the range [0..100,000];
each element of array A is an integer within the range [−1,000,000..1,000,000].
Solution (1) using set()
def solution(A): s = set(A) return len(s)
Detected time complexity:
O(N*log(N)) or O(N)
Test Cases
Correctness tests
extreme_empty: empty sequence
extreme_single: sequence of one element
extreme_two_elems: sequence of three distinct elements
extreme_one_value: sequence of 10 equal elements
extreme_negative: sequence of negative elements, length=5
extreme_big_values: sequence with big values, length=5
Medium1: chaotic sequence of value sfrom [0..1K], length=100
Medium2: chaotic sequence of value sfrom [0..1K], length=200
Medium3: chaotic sequence of values from [0..10], length=200
Performance tests
chaotic sequence of values from [0..100K], length=10K
large_random1
chaotic sequence of values from [-1M..1M], length=100K
large_random2
another chaotic sequence of values from [-1M..1M], length=100K
Sol (2) using dict() and then keys().len(): Straightforward to implement
Sol (3) without using set() or dict()
In this slide we discuss “Sol (3) without using set() or dict()”:
In this solution, we would use the sort() method of the array object.
def solution(A): A.sort() rtn = 1 if len(A) == 0: return 0 else: for i in range(1, len(A)): if A[i] != A[i-1]: rtn += 1 return rtnTags: Technology,Algorithms,Python
Monday, January 15, 2024
Min Avg Two Slice (Problem on Prefix Sums With Medium Complexity)
Min Avg Two Slice: Find the minimal average of any slice containing at least two elements. (Complexity: Medium)
Problem
A non-empty array A consisting of N integers is given. A pair of integers (P, Q), such that 0 ≤ P < Q < N, is called a slice of array A (notice that the slice contains at least two elements). The average of a slice (P, Q) is the sum of A[P] + A[P + 1] + ... + A[Q] divided by the length of the slice. To be precise, the average equals (A[P] + A[P + 1] + ... + A[Q]) / (Q - P + 1).
The goal is to find the starting position of a slice whose average is minimal.
Write a function:
def solution(A)
that, given a non-empty array A consisting of N integers, returns the starting position of the slice with the minimal average. If there is more than one slice with a minimal average, you should return the smallest starting position of such a slice.
Write an efficient algorithm for the following assumptions:
N is an integer within the range [2..100,000];
each element of array A is an integer within the range [-10,000..10,000].
Example
For example, array A such that:
A[0] = 4
A[1] = 2
A[2] = 2
A[3] = 5
A[4] = 1
A[5] = 5
A[6] = 8
contains the following example slices:
slice (1, 2), whose average is (2 + 2) / 2 = 2;
slice (3, 4), whose average is (5 + 1) / 2 = 3;
slice (1, 4), whose average is (2 + 2 + 5 + 1) / 4 = 2.5.
So, the function should return the index of 1 in this case.
Hint (1)
To solve this problem efficiently, you can use the fact that the minimum average slice must be of length 2 or 3. Any larger slice can be split into smaller slices, and if the larger slice has a lower average, one of its smaller slices will also have a lower average.
Code (1)
def solution(A): N = len(A) min_avg = (A[0] + A[1]) / 2.0 min_start = 0 for i in range(2, N): # Calculate the average of a slice of length 2 avg_2 = (A[i - 1] + A[i]) / 2.0 if avg_2 < min_avg: min_avg = avg_2 min_start = i - 1 # Calculate the average of a slice of length 3 avg_3 = (A[i - 2] + A[i - 1] + A[i]) / 3.0 if avg_3 < min_avg: min_avg = avg_3 min_start = i - 2 return min_start A = [4, 2, 2, 5, 1, 5, 8] # Example run result = solution(A) print(result)
This algorithm has a time complexity of O(N) because it iterates through the array only once. The key observation is that the minimal average slice of length 2 or 3 will cover the minimal average slice of any larger length.
Can we redo this code to use the concept of prefix sums?
Certainly! You can use the concept of prefix sums to optimize the calculation of slice sums. The idea is to precompute the prefix sums of the array, and then use them to calculate slice sums in constant time. Here's the modified code:
Code (2)
def solution(A): N = len(A) # Calculate prefix sums prefix_sums = [0] * (N + 1) for i in range(1, N + 1): prefix_sums[i] = prefix_sums[i - 1] + A[i - 1] # Initialize the starting position to 0 and the minimum average to the average of the first two elements min_avg = (A[0] + A[1]) / 2.0 min_start = 0 for i in range(2, N): # Calculate the average of a slice of length 2 using prefix sums avg_2 = (prefix_sums[i] - prefix_sums[i - 2]) / 2.0 if avg_2 < min_avg: min_avg = avg_2 min_start = i - 1 # Calculate the average of a slice of length 3 using prefix sums avg_3 = (prefix_sums[i] - prefix_sums[i - 3]) / 3.0 if avg_3 < min_avg: min_avg = avg_3 min_start = i - 2 return min_start
Line of reasoning
Can you explain this reasoning:
To solve this problem efficiently, you can use the fact that the minimum average slice must be of length 2 or 3. Any larger slice can be split into smaller slices, and if the larger slice has a lower average, one of its smaller slices will also have a lower average.
The reasoning behind this observation is based on the idea that if you have a slice of length greater than 3 with a lower average, you can always find a sub-slice within it that has an even lower average. Let's break it down: Slice of length 2: There are only two elements, so the average is simply the sum of those two elements divided by 2. Slice of length 3: If you have a slice of length 3, there are only two sub-slices within it: one of length 2 and another of length 1. If the average of the slice of length 3 is lower than the average of the slice of length 2, then the average of the slice of length 1 will be even lower. Slice of length greater than 3: For a slice of length greater than 3, you can always find a sub-slice of length 2 or 3 within it. If the average of the larger slice is lower, then the average of one of its sub-slices (either of length 2 or 3) will be even lower. Therefore, it is sufficient to consider only slices of length 2 and 3 when searching for the minimum average slice. This observation simplifies the problem, as you don't need to consider all possible slice lengths. By focusing on slices of length 2 and 3, you can reduce the computational complexity of the algorithm.
Detected time complexity:O(N)
Tests
Correctness tests
double_quadruple
two or four elements
simple1
simple test, the best slice has length 3
simple2
simple test, the best slice has length 3
small_random
random, length = 100
medium_range
increasing, decreasing (legth = ~100) and small functional
Performance Tests
medium_random
random, N = ~700
large_ones
numbers from -1 to 1, N = ~100,000
large_random
random, N = ~100,000
extreme_values
all maximal values, N = ~100,000
large_sequence
many sequences, N = ~100,000
Tuesday, January 9, 2024
Genomic Range Query: Find the minimal nucleotide from a range of sequence DNA (Medium Complexity)
Problem
A DNA sequence can be represented as a string consisting of the letters A, C, G and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types A, C, G and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence?
The DNA sequence is given as a non-empty string S = S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive).
Write a function:
def solution(S, P, Q)
that, given a non-empty string S consisting of N characters and two non-empty arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries.
Result array should be returned as an array of integers.
Write an efficient algorithm for the following assumptions:
N is an integer within the range [1..100,000];
M is an integer within the range [1..50,000];
each element of arrays P and Q is an integer within the range [0..N - 1];
P[K] ≤ Q[K], where 0 ≤ K < M;
string S consists only of upper-case English letters A, C, G, T.
Example
For example, consider string S = CAGCCTA and arrays P, Q such that:
P[0] = 2 Q[0] = 4
P[1] = 5 Q[1] = 5
P[2] = 0 Q[2] = 6
The answers to these M = 3 queries are as follows:
The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2.
The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4.
The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide A whose impact factor is 1, so the answer is 1.
The function should return the values [2, 4, 1], as explained above.
Brute Force Solution
def solution(S, P, Q): impact_factors = { 'A': 1, 'C': 2, 'G': 3, 'T': 4 } X = [impact_factors[i] for i in S] # Implement your solution here rtn = [] for i in range(len(P)): start = P[i] end = Q[i] + 1 s = X[start:end] rtn.append(min(s)) return rtn
Detected time complexity:
O(N * M)
Smart Solution Using Prefix Sums (Part 1)
def solution(S, P, Q): N = len(S) M = len(P) impact_factors = {'A': 1, 'C': 2, 'G': 3, 'T': 4} # Calculate prefix sums for each nucleotide type prefix_sums = [[0] * (N + 1) for _ in range(4)] for i in range(1, N + 1): for j in range(4): prefix_sums[j][i] = prefix_sums[j][i - 1] prefix_sums[impact_factors[S[i - 1]] - 1][i] += 1 print(i, S[i-1]) print(prefix_sums) print(':1 ~ ~ ~ ~ ~ ~ ~ ~ ~') S = 'CAGCCTA' P = [2, 5, 0] Q = [4, 5, 6] print(solution(S, P, Q)) # Output: [2, 4, 1] print(':2 ~ ~ ~ ~ ~ ~ ~ ~ ~') print(solution('AGCT', [], [])) print(':3 ~ ~ ~ ~ ~ ~ ~ ~ ~') print(solution('AACT', [], []))
This is a two part solution.
In the first part, we build prefix sum sequences for each nucleotide to track it’s occurrence.
Output (Part 1)
S = 'CAGCCTA'
:1 ~ ~ ~ ~ ~ ~ ~ ~ ~
1 C
[[0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]
2 A
[[0, 0, 1, 0, 0, 0, 0, 0], [0, 1, 1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]
3 G
[[0, 0, 1, 1, 0, 0, 0, 0], [0, 1, 1, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]
4 C
[[0, 0, 1, 1, 1, 0, 0, 0], [0, 1, 1, 1, 2, 0, 0, 0], [0, 0, 0, 1, 1, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]
5 C
[[0, 0, 1, 1, 1, 1, 0, 0], [0, 1, 1, 1, 2, 3, 0, 0], [0, 0, 0, 1, 1, 1, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0]]
6 T
[[0, 0, 1, 1, 1, 1, 1, 0], [0, 1, 1, 1, 2, 3, 3, 0], [0, 0, 0, 1, 1, 1, 1, 0], [0, 0, 0, 0, 0, 0, 1, 0]]
7 A
[[0, 0, 1, 1, 1, 1, 1, 2], [0, 1, 1, 1, 2, 3, 3, 3], [0, 0, 0, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0, 1, 1]]
Output (Part 2)
For 'AGCT'
:2 ~ ~ ~ ~ ~ ~ ~ ~ ~
1 A
[[0, 1, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
2 G
[[0, 1, 1, 0, 0], [0, 0, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 0, 0]]
3 C
[[0, 1, 1, 1, 0], [0, 0, 0, 1, 0], [0, 0, 1, 1, 0], [0, 0, 0, 0, 0]]
4 T
[[0, 1, 1, 1, 1], [0, 0, 0, 1, 1], [0, 0, 1, 1, 1], [0, 0, 0, 0, 1]]
For 'AACT'
:3 ~ ~ ~ ~ ~ ~ ~ ~ ~
1 A
[[0, 1, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
2 A
[[0, 1, 2, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
3 C
[[0, 1, 2, 2, 0], [0, 0, 0, 1, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]
4 T
[[0, 1, 2, 2, 2], [0, 0, 0, 1, 1], [0, 0, 0, 0, 0], [0, 0, 0, 0, 1]]
Smart Solution Using Prefix Sums (Part 2)
def solution(S, P, Q): N = len(S) M = len(P) impact_factors = {'A': 1, 'C': 2, 'G': 3, 'T': 4} # Calculate prefix sums for each nucleotide type prefix_sums = [[0] * (N + 1) for _ in range(4)] for i in range(1, N + 1): for j in range(4): prefix_sums[j][i] = prefix_sums[j][i - 1] prefix_sums[impact_factors[S[i - 1]] - 1][i] += 1 # Process queries result = [] for k in range(M): start_position = P[k] end_position = Q[k] + 1 for nucleotide_type in range(4): if prefix_sums[nucleotide_type][end_position] - prefix_sums[nucleotide_type][start_position] > 0: result.append(nucleotide_type + 1) break return result # Example usage: S = 'CAGCCTA' P = [2, 5, 0] Q = [4, 5, 6] print(solution(S, P, Q)) # Output: [2, 4, 1]
In the second part of the solution we determine from start to end if a low factor nucleotide is present or not.
Here, let seq_X be sequence from as in prefix_sums[nucleotide_type]
It takes the form:
seq_A: [0, 0, 1, 1, 1, 1, 1, 2],
seq_C: [0, 1, 1, 1, 2, 3, 3, 3],
seq_G: [0, 0, 0, 1, 1, 1, 1, 1],
seq_T: [0, 0, 0, 0, 0, 0, 1, 1]
For input sequence: CAGCCTA
Then from any of this seq, we determine whether if A is appearing between start and end using seq_A[end] - seq_A[start]
Or if C is appearing between start and end using:
seq_C[end] – seq_C[start]
Or if G is appearing between start and end using:
seq_G[end] – seq_G[start]
Or if T is appearing between start and end using:
seq_T[end] – seq_T[start]
If a nucleotide is found in this second part of the code, we record this nucleotide and break out.
Time Complexity
Detected time complexity:
O(N + M)
Tests
Correctness tests
extreme_sinlge
single character string
extreme_double
double character string
simple
simple tests
small_length_string
small length simple string
small_random
small random string, length = ~300
Performance tests
almost_all_same_letters
GGGGGG..??..GGGGGG..??..GGGGGG
large_random
large random string, length
extreme_large
all max ranges
Explanation of Code Using Bard
I'd be glad to explain the code you provided:
Here's a breakdown of the code's key steps:
-
Initialization:
N
stores the length of the DNA sequenceS
.M
stores the number of queries (the length of arraysP
andQ
).impact_factors
maps each nucleotide to its corresponding impact factor.
-
Prefix Sums Calculation:
prefix_sums
is a 2D array with 4 rows (one for each nucleotide type) andN+1
columns. It's used to store the counts of each nucleotide type up to each position in the DNA sequence.- The nested loops iterate through the DNA sequence and update
prefix_sums
accordingly:- The first loop iterates through each position in the sequence.
- The second loop iterates through each nucleotide type.
- For each position and nucleotide type, the code copies the count from the previous position and then increments the count if the current nucleotide matches the type.
-
Query Processing:
result
is an empty list to store the answers to the queries.- The outer loop iterates through each query, accessing its start and end positions from
P
andQ
. - The inner loop iterates through each nucleotide type.
- It checks if the count of the current nucleotide type within the query range (
end_position
-start_position
) is greater than 0. If so, it means that nucleotide type is present within the range, and its impact factor is added to theresult
list. The loop then breaks because we've found the minimum impact factor for that query.
-
Returning Results:
- The function returns the
result
list containing the answers to all queries.
- The function returns the
Example Usage:
- The provided example demonstrates how to use the function with a sample DNA sequence
S
and query arraysP
andQ
. - The output
[2, 4, 1]
correctly represents the minimum impact factors for the specified query ranges inS
.
Key Points:
- The code efficiently calculates prefix sums to count the occurrences of each nucleotide type up to each position.
- This precomputation allows for quick calculation of minimum impact factors within any query range, using simple subtractions of prefix sums.
- The time complexity of the code is O(N + M), making it efficient for large DNA sequences and multiple queries.