Saturday, July 4, 2026

SQLite-Vector: Vector Search in Your Pocket

How sqlite-vector brings AI-powered similarity search to any device — no cloud required

Imagine you have a mobile app that needs to find the most similar image, the best product recommendation, or the right document from a pile of data — all while the user is offline. Traditionally, that would mean sending data to the cloud, running a heavy vector database, and waiting for results. But what if your SQLite database could do all of that, right on the device, with just 30 MB of memory and no indexing wait time? That's exactly what sqlite-vector delivers.

SQLite is already the world's most used database — it's in your phone, your browser, your car, and probably your smart fridge. Sqlite-vector is an extension that adds vector search to SQLite. In plain terms, it lets you store "embeddings" (think of them as mathematical fingerprints of images, text, or audio) and then find the closest matches at lightning speed — all using standard SQL.

Why vector search matters (and why you want it offline)

Modern AI models — from ChatGPT to image recognizers — turn everything into vectors: long lists of numbers that represent the "meaning" of a piece of data. When you want to find something similar, you don't search for exact matches; you search for the nearest neighbors in this high-dimensional space.

Think of it like finding the closest cities on a map — except the map has hundreds of dimensions. That's what vector search does, and it powers:

  • Semantic search — finding documents that are conceptually similar to your query
  • Image retrieval — showing visually similar photos
  • Recommendation systems — matching users with products, videos, or music
  • Voice and audio search — identifying sounds or voice queries
  • Anomaly detection — spotting outliers in sensor data
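The nearest-neighbor idea behind all of these use cases can be sketched in a few lines of plain Python. This is a minimal illustration, not how sqlite-vector is implemented: the three-dimensional vectors are made-up toy values, and cosine distance is just one common choice of metric.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest_neighbors(query, items, k=2):
    """Brute-force scan: score every stored vector, keep the k closest."""
    scored = [(cosine_distance(query, vec), label) for label, vec in items.items()]
    scored.sort()
    return [label for _, label in scored[:k]]

# Toy 3-dimensional "embeddings" (hypothetical values, not from a real model).
items = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}

result = nearest_neighbors([1.0, 0.0, 0.0], items, k=2)
print(result)
```

Real embeddings have hundreds of dimensions rather than three, which is why scanning every row gets expensive and why compression and SIMD matter.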

Until now, doing this on a phone or a low-power device was tricky. You'd need a separate vector search library or database such as FAISS or Weaviate, which can mean setting up complex indexes, waiting hours for preprocessing, and in many cases running a server. Sqlite-vector flips that script.

What makes sqlite-vector different?

Most vector search tools are heavyweight. They require special virtual tables, pre‑indexing phases that can take hours, and external servers. Sqlite-vector takes a radically simpler approach:

  • Works with ordinary SQLite tables: no special schemas
  • No pre‑indexing: start searching immediately
  • Zero‑cost updates: add or change vectors on the fly
  • Offline first: works without internet
  • Cross‑platform: iOS, Android, Windows, Linux, macOS
  • Memory‑efficient: just 30 MB RAM by default

It's built in pure C with SIMD acceleration, which means it runs blazingly fast even on mobile CPUs. And because it's just a SQLite extension, you can drop it into any existing project with minimal effort.

The secret sauce: TurboQuant

One of the coolest features is TurboQuant — a clever quantization technique inspired by a Google Research paper. Instead of storing full-precision vectors (which take up a lot of space), TurboQuant compresses them into 2‑bit, 3‑bit, or 4‑bit representations.

This dramatically reduces memory and storage while keeping search results reasonably accurate. For example, on a dataset of 1 million vectors with 768 dimensions each, raw 32‑bit floats would take about 3 GB. TurboQuant 4‑bit shrinks that to just 396 MB, about 13% of the original size, and a query runs roughly 15 times faster than a full scan (14.9× in the benchmark below, at a Recall@10 of 0.84).
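The storage figures can be checked with some quick arithmetic. The sketch below assumes 1 MB means 10^6 bytes and treats the gap between the packed values and the reported sizes as per-vector overhead. That overhead is an inference from the numbers; the source does not say what it contains.

```python
NUM_VECTORS = 1_000_000
DIMENSIONS = 768

def payload_bytes(bits_per_value):
    """Bytes needed for the vector values alone, with no per-vector metadata."""
    return NUM_VECTORS * DIMENSIONS * bits_per_value // 8

raw = payload_bytes(32)  # full-precision float32 storage

# Quantized storage sizes quoted in the article, in MB (assumed to be 10^6 bytes).
reported_mb = {4: 396, 3: 300, 2: 204}

overhead_per_vector = {}
for bits, size_mb in reported_mb.items():
    packed = payload_bytes(bits)
    # Whatever is left after the packed values, spread across all vectors.
    overhead_per_vector[bits] = (size_mb * 1_000_000 - packed) / NUM_VECTORS

ratio_4bit = reported_mb[4] * 1_000_000 / raw

print(f"raw float32: {raw / 1e9:.2f} GB")
print(f"4-bit share of raw size: {ratio_4bit:.1%}")
print(f"implied overhead per vector (bytes): {overhead_per_vector}")
```

Under these assumptions every mode leaves the same 12 bytes per vector unaccounted for, which suggests the quoted sizes are the packed values plus a small fixed amount of per-vector data.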

Here's a quick look at the performance on a Mac with ARM64 (NEON):

Mode               Quantized storage   Full scan / query   TurboQuant / query   Speedup   Recall@10
TurboQuant 4‑bit   396 MB              3248 ms             218 ms               14.9×     0.84
TurboQuant 3‑bit   300 MB              1727 ms             188 ms               9.2×      0.74
TurboQuant 2‑bit   204 MB              3265 ms             85 ms                38.3×     0.48

The 4‑bit mode is a great starting point: it gives a solid balance of speed, memory, and accuracy. For really tight edge budgets, 2‑bit can be a lifesaver, but its Recall@10 drops to 0.48 in the benchmark above, so test it with your own data before relying on it.
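To test a quantization mode on your own data, you need a way to score it. The sketch below uses the usual definition of Recall@k, the share of the true top-k neighbors that the approximate search also returns. The source does not spell out how its Recall@10 was computed, so treat this as the standard reading, and the ID lists as made-up sample data.

```python
def recall_at_k(exact_ids, approx_ids, k=10):
    """Fraction of the exact top-k results that also appear in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    approx_top = set(approx_ids[:k])
    return len(exact_top & approx_top) / k

# Hypothetical result lists: row IDs from a full scan and from a quantized scan.
exact = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
approx = [1, 2, 3, 4, 5, 6, 7, 8, 42, 99]

score = recall_at_k(exact, approx, k=10)
print(score)
```

A Recall@10 of 0.84 would mean that, averaged over many queries, between eight and nine of the ten true nearest neighbors show up in the quantized results.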

Getting started (it's really this simple)

Sqlite-vector is available as a pre‑built binary for all major platforms — Linux, macOS, Windows, Android, and iOS. You can also load it as a WASM module for browsers.

Here's the basic flow in SQL:

-- 1. Load the extension
.load ./vector

-- 2. Create a regular table (no virtual tables needed!)
CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    embedding BLOB,   -- store vectors as binary blobs
    label TEXT
);

-- 3. Insert a vector (as a blob or JSON array)
INSERT INTO images (embedding, label)
VALUES (vector_as_f32('[0.3, 1.0, 0.9, 3.2, ...]'), 'cat');

-- 4. Initialize the vector column
SELECT vector_init('images', 'embedding', 'type=FLOAT32,dimension=384');

-- 5. Quantize for blazing-fast search (TurboQuant 4‑bit)
SELECT vector_quantize('images', 'embedding', 'qtype=TURBO,qbits=4');

-- 6. Search for the top 20 nearest neighbors
--    (? is a bound parameter holding the query vector)
SELECT e.id, v.distance
FROM images AS e
JOIN vector_quantize_scan('images', 'embedding', ?, 20) AS v
ON e.id = v.rowid;

That's it. No external servers, no complex indexing, no waiting. Your vector search is ready to go.
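To see what "ordinary table with a BLOB column" means from application code, here is a minimal sketch using only Python's standard library. It does not load sqlite-vector. Instead, a plain Python loop stands in for the search so the example runs anywhere. Packing vectors as little-endian float32 bytes is an assumption about the blob layout, and `to_f32_blob`, `from_f32_blob`, and `brute_force_search` are hypothetical helper names.

```python
import sqlite3
import struct

def to_f32_blob(values):
    """Pack a list of floats as little-endian float32 bytes (assumed blob layout)."""
    return struct.pack(f"<{len(values)}f", *values)

def from_f32_blob(blob):
    """Unpack float32 bytes back into a list of Python floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, embedding BLOB, label TEXT)")

# Toy 4-dimensional vectors (hypothetical values).
rows = [
    ("cat", [0.3, 1.0, 0.9, 3.2]),
    ("dog", [0.4, 0.9, 1.0, 3.0]),
    ("car", [3.0, 0.1, 0.2, 0.0]),
]
conn.executemany(
    "INSERT INTO images (embedding, label) VALUES (?, ?)",
    [(to_f32_blob(vec), label) for label, vec in rows],
)

def brute_force_search(conn, query, k):
    """Stand-in for the extension's scan: read every row and keep the k closest."""
    scored = []
    for row_id, blob, label in conn.execute("SELECT id, embedding, label FROM images"):
        scored.append((l2_squared(query, from_f32_blob(blob)), row_id, label))
    scored.sort()
    return [(row_id, label) for _, row_id, label in scored[:k]]

top = brute_force_search(conn, [0.3, 1.0, 0.9, 3.2], k=2)
print(top)
```

With the extension loaded, the Python loop would be replaced by the `vector_quantize_scan` join shown above, and the packed query blob would be passed as the `?` parameter.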

💡 Pro tip: You can also use vector_quantize_preload() to load the quantized data into memory for a 4‑5× speedup — perfect for interactive apps.

Where does it shine?

Sqlite-vector is built for Edge AI — scenarios where you need intelligence on the device, not in the cloud.

  • Mobile apps that do on‑device image search, face recognition, or voice commands
  • Privacy‑first applications where data never leaves the user's device
  • Offline‑first tools like note‑taking apps with semantic search
  • Embedded systems in robots, drones, or IoT devices

Because it's a SQLite extension, you also get all the benefits of a full relational database — transactions, joins, filters, and ACID guarantees — combined with vector search.

The bigger picture

Sqlite-vector is part of a larger ecosystem from SQLite AI that's turning SQLite into a complete runtime for intelligent, distributed data. There's also sqlite‑sync for offline‑first sync, sqlite‑ai for on‑device LLM inference, and sqlite‑agent for autonomous AI agents — all living inside your SQLite database.

If you don't want to manage it yourself, SQLite Cloud offers a hosted version with sync, auth, edge functions, and a free tier that gives you 512 MB and 20 connections — no credit card required.

Wrapping up

Sqlite-vector is a game‑changer for anyone building AI‑powered applications that need to work offline, on mobile, or at the edge. It's fast, tiny, and dead simple to use. You don't need to learn a new database or wrestle with complex indexing — just SELECT your way to similar items.

Whether you're building a photo app, a recommendation engine, or a privacy‑first search tool, sqlite‑vector gives you superpowers right inside your SQLite database. And with TurboQuant, you get enterprise‑grade performance on devices that fit in your pocket.

Ready to try it? Head over to the GitHub repository, grab the binary for your platform, and start searching in minutes. The era of on‑device AI is here — and it speaks SQL.

Resources: GitHub · Docs · SQLite AI · Releases

All performance numbers and benchmarks are from the project's official documentation and were measured on macOS ARM64 with the NEON backend. Your mileage may vary depending on hardware and data.

Tags: Generative AI,Database

Over Rs 9,300 Crore in Forgotten EPF Accounts: Could Build 3 IITs


5 Key Takeaways

  • Unclaimed EPF funds total Rs 9,330 crore in 30.91 lakh dormant accounts as of March 2026.
  • Unclaimed amount decreased modestly from Rs 10,181 crore and 31.83 lakh accounts in 2025.
  • The new EPF Scheme 2026 was notified on June 29, 2026, aiming to simplify and digitize processes.
  • The RTI response revealed a lack of historical data, and the EPFO declined to disclose Aadhaar-linked account details, citing the fiduciary exemption.
  • Proactive outreach and individual action are needed to reunite workers with forgotten savings.



Idle Treasure: Over Rs 9,300 Crore Lies Unclaimed in 31 Lakh Dormant Provident Fund Accounts

Imagine a sum of money large enough to build three brand-new Indian Institutes of Technology — and still have more than Rs 500 crore left over. That is the staggering scale of retirement savings currently gathering dust in forgotten corners of India's largest social security organisation. A recent disclosure under the Right to Information (RTI) Act has revealed that as of March 31, 2026, a colossal Rs 9,330 crore sits unclaimed in 30.91 lakh inoperative Employees' Provident Fund (EPF) accounts across the country.

₹9,330 crore Unclaimed Balance
30.91 lakh Dormant Accounts
As of March 31, 2026 — Source: RTI Response to India Today

This revelation lands at a moment of considerable modernisation for the retirement savings system. On June 29, 2026, the government notified the Employees' Provident Fund Scheme, 2026, a sweeping new rulebook that replaces the nearly 74-year-old EPF Scheme of 1952. The new framework promises simplified rules and a more digital, subscriber-friendly experience for the nearly eight crore active members of the Employees' Provident Fund Organisation (EPFO). Yet even as the system looks ahead, the RTI data provides a sobering snapshot of how much of workers' hard-earned money remains stranded in the past.

What is the EPF, and why does money become unclaimed?

For millions of salaried employees in India, the EPF is the bedrock of retirement planning. A small portion of one's salary — currently 12 per cent from the employee and a matching 12 per cent from the employer — flows into the fund each month. Over a career, these contributions, along with interest declared by the government, compound into a substantial corpus meant to provide financial security after one stops working. The EPFO acts as the custodian of this money, managing accounts for workers in establishments ranging from factories to small offices.

An account becomes "inoperative" — the term the EPFO uses for what most of us would call dormant — when no contributions are made into it for a specified period, typically three consecutive years. This can happen for many reasons. A person might change jobs and fail to transfer the old EPF balance to the new employer's account. A worker could leave the formal workforce altogether, move abroad, or simply lose track of an old PF number received early in a career. In some tragic cases, the account holder may pass away without the family knowing about the savings. Whatever the cause, the money does not disappear; it continues to earn interest, but it sits outside the reach of its rightful owner until someone files a claim.

The numbers, and a modest improvement

The RTI response, accessed exclusively by India Today, lays bare the scale of idle money. According to the EPFO, there were precisely 30,91,862 inoperative accounts holding an unclaimed balance of approximately Rs 9,330 crore as on March 31, 2026. The figures show a modest improvement over the previous financial year. A year earlier, on March 31, 2025, 31.83 lakh such accounts existed with a collective unclaimed sum of Rs 10,181 crore. That means the number of dormant accounts dropped by about 92,000, and the total unclaimed amount shrank by Rs 851 crore over twelve months.

2025 ₹10,181 Cr 31.83 Lakh Accounts
2026 ₹9,330 Cr 30.91 Lakh Accounts
▼ Rs 851 crore reduction | 92,000 fewer dormant accounts

While any downward movement is welcome, the numbers remain vast. Nearly 31 lakh families — for behind every forgotten account is a worker or their dependents — have retirement savings they have not accessed. The problem is persistent and highlights a disconnect between the system's technological growth and the ground-level awareness among the workforce it serves.

A scale that demands comparison

Large numbers sometimes fail to register unless placed against relatable benchmarks. The unclaimed corpus of Rs 9,330 crore comes close to the entire expenditure incurred by the central government on its UDAN regional connectivity scheme since its launch in 2016. That initiative, aimed at making air travel affordable for the common citizen through subsidised flights to smaller cities, has spent Rs 10,169 crore from its inception through early 2026. The dormant EPF money, in other words, amounts to roughly 92 per cent of the lifetime spending on one of the country's most visible infrastructure programmes.

✈️
Comparison One

The unclaimed EPF corpus of ₹9,330 crore is about 92 per cent of the entire expenditure on the UDAN Regional Connectivity Scheme since 2016 (₹10,169 crore), an initiative that has transformed air travel accessibility across India.

Another striking parallel: the amount is nearly equivalent to the Union Budget's 2026-27 allocation for Ayushman Bharat–Pradhan Mantri Jan Arogya Yojana (PM-JAY), the flagship health insurance scheme that covers vulnerable families for secondary and tertiary hospitalisation. The money sitting idle could fund a year of that nationwide healthcare safety net.

🏥
Comparison Two

The dormant EPF money is nearly equivalent to the Union Budget's 2026-27 allocation for Ayushman Bharat–PM-JAY — funding a full year of India's flagship health insurance safety net for vulnerable families.

Perhaps the most vivid illustration comes from the education sector. In 2014, a government estimate pegged the cost of establishing a new Indian Institute of Technology at Rs 1,750 crore. Adjusting for inflation, that figure would be roughly Rs 2,934 crore in 2026. At that price, the Rs 9,330 crore lying unclaimed in EPF accounts could fully fund the construction of three new IITs — with over Rs 500 crore still left in the bank. These comparisons are, of course, illustrative. The unclaimed money belongs to individual workers and legally cannot be diverted for any other purpose. But they succeed in throwing the magnitude of the stranded retirement savings into sharp relief.

₹9,330 crore could build three brand-new IITs at 2026 inflation-adjusted costs — with over ₹500 crore still remaining.
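The IIT comparison rests on simple arithmetic that is easy to verify. The sketch below uses only the figures quoted above. The annual inflation rate is not stated in the article. It is derived here from the 2014 and 2026 cost estimates, so treat it as an implied value rather than an official one.

```python
COST_2014_CRORE = 1750    # government estimate for one new IIT in 2014
COST_2026_CRORE = 2934    # inflation-adjusted figure quoted in the article
UNCLAIMED_CRORE = 9330    # unclaimed EPF balance as of March 31, 2026

years = 2026 - 2014

# Annual rate that would turn the 2014 cost into the 2026 cost (derived, not sourced).
implied_rate = (COST_2026_CRORE / COST_2014_CRORE) ** (1 / years) - 1

iits_funded = UNCLAIMED_CRORE // COST_2026_CRORE
leftover_crore = UNCLAIMED_CRORE - iits_funded * COST_2026_CRORE

print(f"implied annual inflation: {implied_rate:.1%}")
print(f"IITs funded: {iits_funded}, leftover: Rs {leftover_crore} crore")
```

Three institutes at Rs 2,934 crore each come to Rs 8,802 crore, which is where the "over Rs 500 crore left" figure comes from.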

The search for long-term trends hits a wall

To understand whether the pool of unclaimed money is shrinking or growing over time, the RTI applicant sought year-wise data for the last five financial years. The EPFO, however, said it could provide information only for the years 2025 and 2026. The reason reveals something about the internal architecture of the organisation. The EPFO stated that its Inoperative Accounts Cell (IAC) was established only during the financial year 2025-26, and that records for earlier years were not maintained by that cell. Without a longer historical trail, it becomes difficult to measure whether the recent reduction is an acceleration of a trend or simply a one-off correction.

Where the transparency hits a wall

The RTI application also delved into more granular details about the nature of these dormant accounts. The questions were pointed: how many of these inoperative accounts are linked with Aadhaar, what sum of money lies in those Aadhaar-linked accounts, and what is the status of auto-settlement for them? The EPFO declined to share this information, invoking Section 8(1)(e) of the RTI Act. This section exempts from disclosure any information held by a public authority in a fiduciary relationship — treating the EPFO as a trustee that owes a duty of confidentiality to account holders. While the legal basis is understandable, the decision leaves a significant information gap. Aadhaar linkage has been a central pillar of the government's strategy to streamline EPF processes, enabling easier portability and claim settlements. Knowing how much of the unclaimed stock belongs to Aadhaar-seeded accounts could help design targeted outreach programmes.

Another line of inquiry sought to identify high-value dormant accounts — specifically, the number of inoperative EPF accounts that hold more than Rs 5 lakh each. The EPFO's reply was straightforward: such data is not maintained in the format sought and therefore could not be furnished under the RTI Act. The absence of this breakdown makes it impossible to know whether the unclaimed billions are scattered across millions of small accounts or concentrated in a smaller number of large forgotten nest eggs. Both scenarios carry very different implications for policy intervention.

The new EPF Scheme, 2026: A fresh start?

All of this unfolds against the backdrop of the EPF Scheme, 2026, which came into force on June 29. The scheme, notified in the official Gazette, overhauls the archaic 1952 regulations that have governed provident funds since the early years of the Republic. The primary objectives are simplification and digitisation. For the common subscriber, this could eventually mean a more seamless process for tracking accounts, merging multiple PF numbers, and withdrawing money without having to run from pillar to post. A unified portal, stronger Aadhaar-based authentication, and a push towards universal account portability are all part of the design.

If implemented well, the new scheme could address some of the very reasons accounts become inoperative in the first place. Many dormant accounts are not deliberately abandoned; they are simply lost in the transition between jobs. When a worker moved from one company to another before the UAN (Universal Account Number) system matured, they often ended up with multiple PF member IDs. Consolidating those IDs under a single UAN, which has been a multi-year project, is meant to solve exactly this issue. The 2026 scheme is expected to deepen that integration.

Yet, the existence of Rs 9,330 crore in unclaimed money even as these reforms roll out suggests that technological fixes alone cannot solve a problem rooted in awareness and behaviour. The EPFO has for years run campaigns urging members to link Aadhaar, seed bank accounts, and update nominee details. The scale of the remaining unclaimed pool indicates that millions still have not done so — or cannot, because they may not know an old account exists in their name.

What this means for workers and their families

Every rupee in those 30.91 lakh accounts belongs to someone. It could be the retirement cushion a factory worker intended to rely on after decades on the shop floor. It could be the nest egg a former IT professional forgot after switching companies twice in three years. It could be the savings of a person who passed away during the pandemic, leaving a family unaware that a PF account even existed. The system's failure to connect these dots is not merely administrative; it is deeply personal.

The EPFO has a mechanism for legal heirs to claim the PF balance of deceased members. But without awareness, the money remains untouched. Similarly, workers who exit the formal workforce often do not realise they can withdraw their full EPF balance after a waiting period of two months, or retain it and continue to earn interest. In the absence of active communication, inertia sets in, and the account slips into dormancy.

The modest decline in unclaimed balances between 2025 and 2026, a drop of Rs 851 crore, is a sign that some remediation is working. Perhaps the EPFO's drive to settle pending claims and the increased use of the UAN for portability are beginning to bear fruit. But at this pace, clearing the entire backlog would take nearly 11 years, assuming no new accounts become inoperative along the way. That is a long time for a worker to wait for their own money.
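The backlog estimate follows directly from the two year-end figures. This quick check assumes, as the paragraph does, that the annual reduction stays constant and that no new accounts turn inoperative.

```python
UNCLAIMED_2025_CRORE = 10181   # as of March 31, 2025
UNCLAIMED_2026_CRORE = 9330    # as of March 31, 2026

annual_drop = UNCLAIMED_2025_CRORE - UNCLAIMED_2026_CRORE
pct_decline = annual_drop / UNCLAIMED_2025_CRORE

# Years needed to clear the remaining balance at the same yearly pace.
years_to_clear = UNCLAIMED_2026_CRORE / annual_drop

print(f"annual drop: Rs {annual_drop} crore ({pct_decline:.1%})")
print(f"years to clear at this pace: {years_to_clear:.1f}")
```

In practice interest keeps accruing on these balances and new accounts keep going dormant, so the real timeline could be longer.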

The road ahead

The EPF Scheme, 2026, provides the legal architecture for a more agile and responsive provident fund system. What remains to be seen is how aggressively the EPFO will use its new mandate to reunite people with their savings. Proactive identification of dormant accounts, data-driven outreach using the Aadhaar and UAN infrastructure (while respecting privacy boundaries), and collaboration with employers to verify old records could all accelerate the process. The organisation might also consider publishing more granular data voluntarily — breaking down unclaimed amounts by age of the account, by sector, or by geography — to invite public scrutiny and support.

For the individual account holder or their dependents, the message is clear: it is worth spending half an hour checking if you, or a late family member, might have an old PF account floating in the system. The EPFO offers an online facility to search for a forgotten PF number using basic details like your name, date of birth, and past employer information. In a world of digital conveniences, the process is no longer the paperwork nightmare it once was. And with Rs 9,330 crore of your collective retirement money waiting in statutory trust, the prize for a little detective work is more than a curiosity — it could be a life-changing sum.

The RTI disclosure is a stark reminder that even a reformed and modernising system carries the weight of its past. The new scheme has turned the page. Now, the real work begins: ensuring that every rupee sitting idle in those millions of dormant accounts finds its way back to the worker who earned it.

