Thursday, February 19, 2026

Hamming Sort - sorting based on count of differing bits in the binary representations of two integers (Easy)

Index of "Algorithms: Design and Analysis"
<<< Previous    Next >>>

Basic Programming > Bit Manipulation > Basics of Bit Manipulation

Solve on HackerEarth
Problem

Given an array of integers, denoted as A, and an integer K, your task is to implement a sorting algorithm. The goal is to arrange the elements in A based on their Hamming Distance from the integer K.

The Hamming Distance is defined as the count of differing bits in the binary representations of two integers.

The sorting should be done in ascending order of Hamming distance, and in case of a tie in Hamming Distances, the elements should be sorted in ascending numerical order.

Input Format:

Note: This is the input format that you must use to provide custom input (available above the Compile and Test button).

The first line contains T denoting the number of test cases. T also specifies the number of times you have to run the solve function on a different set of inputs. For each test case:

  • The first line contains the integer N and K.
  • The second line contains the array A of length N

Output Format: For each test case, print the answer in a new line. 

Constraints:

1T101N1051K1051A[i]105

Sample Input
1
3 2
4 5 6
Sample Output
6 4 5
Time Limit: 2
Memory Limit: 256
Source Limit:
Explanation

In the example, binary representation of K=2 is "10". Similarly for array [4,5,6], the binary representation is [100,101,110]. Now the hamming distance of each element of the array with K are [2,3,1]. Hence sorting on the basis of Hamming Distance, the final output comes out to be [6,4,5].



I am learning Data Structures and Algorithms.

I will provide a screenshot of a problem statement. 

Your task is NOT to directly give me the final code solution unless I explicitly ask for it.

Instead, please do the following:

1. First, carefully restate the problem in your own words to confirm correct understanding.
   - If anything is unclear in the screenshot, ask me clarification questions before proceeding.

2. Identify:
   - What category of problem this is (e.g., prefix sum, greedy, DP, two pointers, binary search, graph, etc.)
   - What core DSA concepts are being tested
   - What mathematical observations (if any) are involved
   - What constraints are important and how they affect the approach

3. Explain:
   - What naive/brute-force solution would look like
   - Why that may or may not work under the given constraints
   - What optimization insight is required

4. Clearly explain the key trick or insight needed to solve it.
   - Why does that trick work?
   - What pattern should I remember for future problems?

5. Provide:
   - The time complexity and space complexity of the optimal approach
   - Why it satisfies the constraints

6. If there are multiple valid approaches, compare them.

7. Only after all conceptual explanation, provide clean and correct Python code.

8. Avoid guessing unclear details.
   - If any part of the screenshot is ambiguous, ask me instead of assuming.

Focus on correctness over speed. Be explicit in reasoning.

1) Restating the problem (in my own words)

For each test case, you’re given:

  • an array A of N integers

  • an integer K

You must sort the array A using this key:

  1. Primary key: the Hamming distance between A[i] and K

  2. Tie-breaker: if two elements have the same Hamming distance, the smaller number comes first

Here, Hamming distance between two integers means:

the number of bit positions where their binary representations differ.

So you output the sorted array for each test case on a new line.

Nothing in the screenshot seems ambiguous enough to block solving, so I’ll proceed.


2) What type of problem is this?

Category

  • Sorting with a custom key

  • Uses bit manipulation (XOR + popcount)

Core DSA concepts being tested

  • Defining a correct comparison/key for sorting

  • Bitwise XOR

  • Counting set bits (popcount)

  • Handling constraints efficiently (N up to 1e5)

Key mathematical / bit observation

For integers x and K:

  • Bits differ exactly where x XOR K has a 1

  • Therefore:
    HammingDistance(x, K) = popcount(x XOR K)

This avoids manually comparing binary strings.

Important constraints & impact

  • 1 ≤ T ≤ 10

  • 1 ≤ N ≤ 10^5 per test

  • Values up to 10^5

Main cost is sorting: O(N log N) per test, which is fine at 1e5 scale.


3) Naive / brute-force approach

What brute force looks like

For each element:

  1. Convert x and K to binary strings

  2. Pad with leading zeros to equal length

  3. Count positions where they differ

  4. Sort by (distance, x)

Why it may be suboptimal

  • Binary conversion + padding per element adds overhead

  • Still ends up sorting, so you’re paying extra constant factors for no benefit

It might pass, but it’s not the cleanest or most robust approach under tight time limits.


4) The key trick / insight

Trick

Use:

  • d = (x ^ K) to mark differing bit positions

  • popcount(d) to count how many positions differ

So you sort by:

Plain text
(popcount(x XOR K), x)

Why it works

  • XOR outputs 1 exactly when bits differ

  • Counting 1s gives the number of differing positions → the Hamming distance

Pattern to remember

Whenever you see:

  • “differing bits”

  • “Hamming distance of integers”

  • “bit positions that differ”

Think immediately:

Hamming distance = popcount(a XOR b)


5) Complexity of the optimal approach

For each test case:

  • Compute distances: O(N) (each popcount is very fast)

  • Sort: O(N log N)

  • Extra space: O(N) if you store (key, value) pairs, or O(1) extra if sorting in-place with a key function (Python still allocates some overhead internally)

This satisfies constraints for N = 1e5.


6) Multiple valid approaches (comparison)

Approach A (recommended): sort with key = (popcount(x^K), x)

  • Simple, clean

  • Fast in practice

Approach B: stable sort twice

  1. Sort numerically

  2. Stable sort by Hamming distance

  • Works because Python sort is stable

  • Slightly more verbose, sometimes useful if tie-breaking rules get more complex

Approach A is the most direct.


7) Python code (after concepts)

Python
import sys

def popcount(x: int) -> int:
# Python 3.8+ has int.bit_count() (actually added in 3.8/3.10 depending on env);
# fall back safely if unavailable.
try:
return x.bit_count()
except AttributeError:
return bin(x).count("1")

def solve():
data = sys.stdin.buffer.read().split()
t = int(data[0])
idx = 1
out_lines = []

for _ in range(t):
n = int(data[idx]); k = int(data[idx + 1]); idx += 2
arr = list(map(int, data[idx:idx + n])); idx += n

arr.sort(key=lambda x: (popcount(x ^ k), x))
out_lines.append(" ".join(map(str, arr)))

sys.stdout.write("\n".join(out_lines))

if __name__ == "__main__":
solve()

If you want, I can also walk you through a dry run on the sample [4,5,6] with K=2 and show exactly how the keys become (2,4), (3,5), (1,6) → output 6 4 5.



modify your code to adapt to this boilerplate code:

def solve (N, K, A):
    # Write your code here
    pass

T = int(input())
for _ in range(T):
    custom_input_1 = list(map(str, input().split()))
    N = int(custom_input_1[0])
    K = int(custom_input_1[1])
    A = list(map(int, input().split()))

    out_ = solve(N, K, A)
    print (' '.join(map(str, out_)))

Sure — here’s the same logic adapted to your exact boilerplate (no extra I/O tricks, just implement solve):

Python
def solve(N, K, A):
# Sort by (HammingDistance(A[i], K), A[i])
# HammingDistance(x, K) = popcount(x XOR K)
A.sort(key=lambda x: ((x ^ K).bit_count(), x))
return A

T = int(input())
for _ in range(T):
custom_input_1 = list(map(str, input().split()))
N = int(custom_input_1[0])
K = int(custom_input_1[1])
A = list(map(int, input().split()))

out_ = solve(N, K, A)
print(' '.join(map(str, out_)))

If your Python version doesn’t support .bit_count(), tell me which version you’re using and I’ll swap in a compatible popcount.



can you explain this line a bit more clearly in layman terms:

A.sort(key=lambda x: ((x ^ K).bit_count(), x))

Absolutely.

What A.sort(key=...) means

sort() rearranges the list.
The key= part tells Python: “For each element, compute a value (a ‘sorting label’). Sort by that label instead of the element itself.”

So here, for every number x in A, we compute a pair (tuple):

Python
((x ^ K).bit_count(), x)

Python sorts tuples like this:

  1. Compare the first value

  2. If tied, compare the second value


Breaking down the key: ((x ^ K).bit_count(), x)

1) x ^ K (XOR)

XOR compares bits of x and K:

  • If a bit is the same → result bit is 0

  • If a bit is different → result bit is 1

So x ^ K produces a number whose 1 bits mark exactly the positions where x and K differ.

Example: x=4 (100), K=2 (010)

Code
100
^ 010
= 110

They differ in two positions → 110 has two 1s.

2) .bit_count()

This counts how many 1 bits are in the binary form of a number.

So:

Python
(x ^ K).bit_count()

= number of differing bit positions
= Hamming distance between x and K

3) , x (tie-breaker)

If two numbers have the same Hamming distance from K, we want the smaller number first — that’s why we include x as the second item in the tuple.


In plain English

This line tells Python:

“Sort the array by how many bits differ from K (fewest differences first).
If two numbers differ by the same amount, put the smaller number first.”


Tiny example

Let K = 2, A = [4, 5, 6]

Compute the key for each x:

  • x=4: 4^2 = 6 (110), bit_count = 2 → key (2, 4)

  • x=5: 5^2 = 7 (111), bit_count = 3 → key (3, 5)

  • x=6: 6^2 = 4 (100), bit_count = 1 → key (1, 6)

Sort by keys → (1,6), (2,4), (3,5) → result [6, 4, 5]

If you want, I can also rewrite that lambda into a named function (often clearer when learning).

No comments:

Post a Comment