Tuesday, June 16, 2026

Attempting to run a local LLM via LM Studio

Explain these logs... (logs attached at the end of the post)

The key error is this line:


ggml_vulkan: device Vulkan0 does not support 16-bit storage.
...
llama_model_load: error loading model: Unsupported device

Everything else is LM Studio repeatedly trying to load different GGUF models and failing for the same reason.

What's happening

LM Studio is using the Vulkan backend (GPU acceleration via llama.cpp/ggml).

When it starts loading a model, it detects your Vulkan-capable GPU:


MESA-INTEL: warning: Haswell Vulkan support is incomplete

This is already a warning sign. It means you're using an Intel Haswell-generation GPU (around 2013–2015), and Mesa's Vulkan implementation for it is incomplete.

Then:


ggml_vulkan: device Vulkan0 does not support 16-bit storage.

Modern llama.cpp Vulkan kernels require the Vulkan feature:


VK_KHR_16bit_storage

or equivalent 16-bit storage support.

Your GPU/driver does not provide it.

As soon as ggml checks the GPU capabilities, model initialization aborts:


error loading model: Unsupported device

Why every model fails

You tried multiple models:

LFM2.5-1.2B-Instruct-Q6_K
gemma-3-270m-it-qat-Q4_0
LFM2-350M-Q4_0

All fail identically.

That tells us:

✅ Models are not corrupted

✅ Quantization is not the issue

✅ Context size is not the issue

✅ RAM is probably not the issue

❌ GPU backend initialization is failing before model loading completes

The important clues

Clue 1


MESA-INTEL: warning: Haswell Vulkan support is incomplete

Haswell Intel GPUs are:

HD 4200
HD 4400
HD 4600
Iris 5100
Iris Pro 5200

These GPUs have only experimental Vulkan support through Mesa's ANV driver.

Clue 2


device Vulkan0 does not support 16-bit storage

llama.cpp Vulkan backend requires GPU features that Haswell generally lacks.

Therefore LM Studio cannot use Vulkan on this hardware.

The strange token warning

This warning is unrelated:


control-looking token: 212 '</s>' was not control-type

This comes from the Gemma GGUF metadata.

It's usually harmless and not the reason for failure.

Most likely setup

You're probably running:

Linux
Intel Haswell integrated graphics
Mesa Vulkan driver
LM Studio attempting GPU acceleration

The GPU backend is unsupported.

How to verify

Run:

Bash
lspci | grep VGA

and

Bash
vulkaninfo | grep "deviceName"

or:

Bash
glxinfo | grep "OpenGL renderer"

You'll likely see something like:


Intel HD Graphics 4600

or another Haswell GPU.

Fixes

Option 1: Disable GPU acceleration (recommended)

In LM Studio:

Settings
Runtime / Hardware
Disable Vulkan GPU acceleration
Force CPU inference

or set GPU layers to:

Then reload the model.

For small models like:

270M
350M
1.2B

CPU inference should work fine.

Option 2: Update Mesa

Check:

Bash
mesa --version

or:

Bash
vulkaninfo

A newer Mesa may expose more features, but Haswell hardware itself generally lacks the required capability, so don't expect this to solve it.

Option 3: Use OpenCL/CPU backend

If LM Studio offers:

CPU
OpenBLAS
CLBlast/OpenCL

use those instead of Vulkan.

Option 4: Use a newer GPU

Any reasonably modern GPU supports the needed Vulkan features:

Intel Xe
Intel Arc
AMD RDNA
NVIDIA GTX 10xx and newer

Bottom line

Your models are failing because LM Studio is trying to use the Vulkan backend on an older Intel Haswell GPU, and that GPU does not support Vulkan 16-bit storage, which the llama.cpp Vulkan implementation requires:


ggml_vulkan: device Vulkan0 does not support 16-bit storage

The practical fix is to disable GPU/Vulkan acceleration and run the models on CPU, or use newer graphics hardware.

(base) jain@jain-ThinkPad-T440s:~$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)

That confirms it.

Your machine has:


Intel Corporation Haswell-ULT Integrated Graphics Controller

On a ThinkPad T440s this is typically one of:

Intel HD Graphics 4400
Intel HD Graphics 5000

These are 4th-generation Intel GPUs (2013-era Haswell).

Why LM Studio fails

The log showed:


ggml_vulkan: device Vulkan0 does not support 16-bit storage

and


error loading model: Unsupported device

The Vulkan backend used by LM Studio (via llama.cpp/ggml) expects GPU features that Haswell iGPUs do not provide. Even though Mesa exposes Vulkan support, it is incomplete:


MESA-INTEL: warning: Haswell Vulkan support is incomplete

So LM Studio sees a Vulkan device, tries to use it, then aborts when the required feature checks fail.

What you should do

Disable GPU acceleration completely

In LM Studio:

Open Settings
Go to Runtime or Hardware Acceleration
Select CPU instead of Vulkan
Set GPU Offload Layers to 0
Reload the model

For your hardware this is the correct configuration.

Can the T440s still run local models?

Yes, but on CPU.

The T440s usually has:

2-core/4-thread or 2-core/4-thread Haswell-U CPU
8–12 GB RAM (sometimes 16 GB)

Reasonable models:

Model	Status
Gemma 3 270M	Excellent
LFM2 350M	Excellent
TinyLlama 1.1B	Good
Qwen 2.5 0.5B	Good
Qwen 2.5 1.5B Q4	Usable
Phi-3 Mini Q4	Slow but usable
7B models	Very slow

Verify LM Studio is using CPU

After changing settings, load a model and check the logs.

You should not see:


ggml_vulkan


Vulkan0

Instead you'll see messages mentioning CPU buffers, threads, and model tensors loading successfully.

If LM Studio still insists on Vulkan

Run:

Bash
LIBGGML_VULKAN=0 lmstudio

Bash
GGML_VULKAN_DISABLE=1 lmstudio

depending on how LM Studio is installed.

If you're using the AppImage, tell me:

Bash
which lmstudio

or how you installed it (AppImage, .deb, Flatpak, etc.), and I can give the exact command to force CPU mode.

Complete Logs


Developer Logs
2026-06-16 18:43:30 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 18:43:30 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:43:30 [DEBUG]
 0.00.109.877 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf'
2026-06-16 18:43:31 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:43:31 [DEBUG]
 0.00.349.937 E llama_model_load: error loading model: Unsupported device
0.00.349.964 E llama_model_load_from_file_impl: failed to load model
0.00.349.965 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf'
0.00.349.968 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf': error loading model: Unsupported device
2026-06-16 18:43:31 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:43:49 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 18:43:49 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:43:49 [DEBUG]
 0.00.160.092
2026-06-16 18:43:49 [DEBUG]
 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf'
2026-06-16 18:43:49 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:43:50 [DEBUG]
 0.00.405.523 E llama_model_load: error loading model: Unsupported device
0.00.405.543 E llama_model_load_from_file_impl: failed to load model
0.00.405.545 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf'
0.00.405.547 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf': error loading model: Unsupported device
2026-06-16 18:43:50 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:44:03 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 18:44:03 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:44:03 [DEBUG]
 0.00.071.932 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf'
2026-06-16 18:44:03 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:44:03 [DEBUG]
 0.00.269.473 E llama_model_load: error loading model: Unsupported device
0.00.269.499 E llama_model_load_from_file_impl: failed to load model
0.00.269.500 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf'
0.00.269.502 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/LFM2.5-1.2B-Instruct-GGUF/LFM2.5-1.2B-Instruct-Q6_K.gguf': error loading model: Unsupported device
2026-06-16 18:44:03 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:47:29 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 18:47:29 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:47:29 [DEBUG]
 Applying legacy swa_full=true default for arch gemma3
0.00.110.637 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
2026-06-16 18:47:30 [DEBUG]
 0.00.668.125 W load: control-looking token:    212 '' was not control-type; this is probably a bug in the model. its type will be overridden
2026-06-16 18:47:30 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:47:30 [DEBUG]
 0.00.836.575 E llama_model_load: error loading model: Unsupported device
0.00.836.602 E llama_model_load_from_file_impl: failed to load model
0.00.836.603 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
0.00.836.606 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 18:47:30 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:47:58 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 18:47:58 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:47:58 [DEBUG]
 Applying legacy swa_full=true default for arch gemma3
0.00.107.554 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
2026-06-16 18:47:59 [DEBUG]
 0.00.632.439 W load: control-looking token:    212 '' was not control-type; this is probably a bug in the model. its type will be overridden
2026-06-16 18:47:59 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:47:59 [DEBUG]
 0.00.802.706 E llama_model_load: error loading model: Unsupported device
0.00.802.735 E llama_model_load_from_file_impl: failed to load model
0.00.802.737 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
0.00.802.739 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 18:47:59 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:53:59 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 18:53:59 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:53:59 [DEBUG]
 Applying legacy swa_full=true default for arch gemma3
0.00.161.414 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
2026-06-16 18:54:00 [DEBUG]
 0.00.813.256
2026-06-16 18:54:00 [DEBUG]
 W load: control-looking token:    212 '' was not control-type; this is probably a bug in the model. its type will be overridden
2026-06-16 18:54:00 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:54:00 [DEBUG]
 0.01.092.471 E llama_model_load: error loading model: Unsupported device
0.01.092.498 E llama_model_load_from_file_impl: failed to load model
0.01.092.500 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
0.01.092.502 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 18:54:00 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:56:57 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 18:56:58 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:56:58 [DEBUG]
 Applying legacy swa_full=true default for arch gemma3
0.00.113.876 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
2026-06-16 18:56:58 [DEBUG]
 0.00.664.116 W load: control-looking token:    212 '' was not control-type; this is probably a bug in the model. its type will be overridden
2026-06-16 18:56:58 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:56:58 [DEBUG]
 0.00.832.215 E llama_model_load: error loading model: Unsupported device
0.00.832.233 E llama_model_load_from_file_impl: failed to load model
0.00.832.234 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
0.00.832.237 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 18:56:58 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:58:53 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8196 kv_unified=true
2026-06-16 18:58:53 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:58:53 [DEBUG]
 Applying legacy swa_full=true default for arch gemma3
0.00.141.947 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
2026-06-16 18:58:54 [DEBUG]
 0.00.766.698 W load: control-looking token:    212 '' was not control-type; this is probably a bug in the model. its type will be overridden
2026-06-16 18:58:54 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:58:54 [DEBUG]
 0.00.952.940 E llama_model_load: error loading model: Unsupported device
0.00.952.974 E llama_model_load_from_file_impl: failed to load model
0.00.952.976 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
0.00.952.979 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 18:58:54 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:59:22 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8196 kv_unified=true
2026-06-16 18:59:22 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:59:22 [DEBUG]
 Applying legacy swa_full=true default for arch gemma3
0.00.125.143 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
2026-06-16 18:59:23 [DEBUG]
 0.00.736.472 W load: control-looking token:    212 '' was not control-type; this is probably a bug in the model. its type will be overridden
2026-06-16 18:59:23 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:59:23 [DEBUG]
 0.00.917.603 E llama_model_load: error loading model: Unsupported device
0.00.917.620 E llama_model_load_from_file_impl: failed to load model
0.00.917.622 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
0.00.917.624 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 18:59:23 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 18:59:29 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8196 kv_unified=true
2026-06-16 18:59:29 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 18:59:29 [DEBUG]
 Applying legacy swa_full=true default for arch gemma3
2026-06-16 18:59:29 [DEBUG]
 0.00.117.496 I srv    load_model: loading model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
2026-06-16 18:59:30 [DEBUG]
 0.00.682.134 W load: control-looking token:    212 '' was not control-type; this is probably a bug in the model. its type will be overridden
2026-06-16 18:59:30 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 18:59:30 [DEBUG]
 0.00.855.008 E llama_model_load: error loading model: Unsupported device
0.00.855.042 E llama_model_load_from_file_impl: failed to load model
0.00.855.044 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf'
0.00.855.046 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/lmstudio-community/gemma-3-270m-it-qat-GGUF/gemma-3-270m-it-qat-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 18:59:30 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 19:02:02 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 19:02:02 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 19:02:02 [DEBUG]
 0.00.092.224 I srv    load_model: loading model '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf'
2026-06-16 19:02:02 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 19:02:02 [DEBUG]
 0.00.300.094 E llama_model_load: error loading model: Unsupported device
0.00.300.121 E llama_model_load_from_file_impl: failed to load model
0.00.300.123 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf'
0.00.300.125 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 19:02:02 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 19:02:06 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 19:02:06 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 19:02:06 [DEBUG]
 0.00.077.805 I srv    load_model: loading model '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf'
2026-06-16 19:02:06 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 19:02:07 [DEBUG]
 0.00.301.853 E llama_model_load: error loading model: Unsupported device
0.00.301.872 E llama_model_load_from_file_impl: failed to load model
0.00.301.874 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf'
0.00.301.876 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 19:02:07 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}
2026-06-16 19:02:38 [DEBUG]
 LlamaV4::load called with model path: /home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf
LlamaV4::load config: n_parallel=4 n_ctx=8192 kv_unified=true
2026-06-16 19:02:38 [DEBUG]
 MESA-INTEL: warning: Haswell Vulkan support is incomplete
2026-06-16 19:02:38 [DEBUG]
 0.00.079.694 I srv    load_model: loading model '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf'
2026-06-16 19:02:38 [DEBUG]
 ggml_vulkan: device Vulkan0 does not support 16-bit storage.
2026-06-16 19:02:38 [DEBUG]
 0.00.316.444 E llama_model_load: error loading model: Unsupported device
0.00.316.476 E llama_model_load_from_file_impl: failed to load model
0.00.316.478 E common_init_from_params: failed to load model '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf'
0.00.316.480 E srv    load_model: failed to load model, '/home/jain/.lmstudio/models/LiquidAI/LFM2-350M-GGUF/LFM2-350M-Q4_0.gguf': error loading model: Unsupported device
2026-06-16 19:02:38 [DEBUG]
 [LLMProcess] Failed to load model _0x3f9935 [Error]: Failed to load model.
    at _0x3f7ad2.loadModel (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:562652)
    at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
    at async _0x3f7ad2.handleMessage (/opt/LM-Studio/resources/app/.webpack/lib/llmworker.js:1:554788) {
  cause: 'Failed to load model',
  suggestion: undefined,
  errorData: undefined,
  data: undefined,
  displayData: undefined,
  title: 'Failed to load model.'
}

See All on GenAI « Previously
Tags: Generative AI,Agentic AI,

Liquid LFM2.5-1.2B: A Tiny AI Model with Surprisingly Big Ambitions

See All on AI Model Releases « Previously

Liquid LFM2.5-1.2B: A Tiny AI Model with Surprisingly Big Ambitions

Exploring how Liquid AI's compact 1.2-billion-parameter model delivers fast, efficient, and capable AI experiences on everyday devices.

The AI industry has spent years chasing larger and larger language models, often requiring powerful servers, expensive GPUs, and significant energy consumption. Liquid AI is taking a different path with the LFM2.5-1.2B family—a compact model designed to bring advanced AI capabilities directly onto edge devices while maintaining competitive performance.

Available in Base, Instruct, and Thinking variants, the LFM2.5-1.2B models demonstrate that useful AI doesn't necessarily require tens or hundreds of billions of parameters. Instead, Liquid AI focuses on efficiency, optimized architecture design, and extensive training to maximize performance within a small footprint.

What Is LFM2.5-1.2B?

LFM2.5-1.2B is part of Liquid AI's latest generation of Liquid Foundation Models (LFMs). Built specifically for on-device deployment, the model contains approximately 1.2 billion parameters and is optimized for low memory usage, fast inference, and real-world deployment scenarios.

Model Size

1.2 Billion Parameters

Context Length

Up to 32K Tokens

Deployment

On-device, Edge, Cloud

Memory Footprint

Under 1GB in optimized deployments

The model builds upon Liquid AI's hybrid architecture approach, combining attention mechanisms with specialized convolutional components to improve speed and efficiency compared to traditional transformer-only designs.

Why Small Models Matter Again

As enterprises and developers seek lower costs, improved privacy, and reduced latency, compact models are becoming increasingly important. Rather than sending every request to a remote server, organizations can deploy lightweight AI systems directly on laptops, smartphones, embedded systems, and edge infrastructure.

      Key Idea: Instead of competing solely on parameter count,
      LFM2.5-1.2B competes on efficiency, speed, and deployment flexibility.
    

This shift is particularly valuable for industries where privacy, reliability, and offline operation are critical. Healthcare, industrial automation, customer service, field operations, and mobile applications can all benefit from running AI locally.

The Three Variants of LFM2.5-1.2B

1. LFM2.5-1.2B-Base

The Base version serves as a foundation model intended for customization, fine-tuning, and specialized applications. Developers can adapt it to domain-specific tasks without starting from scratch.

2. LFM2.5-1.2B-Instruct

The Instruct model is optimized for conversational AI and instruction following. It delivers a user-friendly chat experience while maintaining fast response times and low hardware requirements.

3. LFM2.5-1.2B-Thinking

The Thinking version is designed for reasoning-heavy tasks. It generates intermediate reasoning steps before producing answers, helping improve performance on multi-step problems, logical reasoning, planning, and complex decision-making tasks.

Performance Beyond Its Size

One of the most impressive aspects of LFM2.5-1.2B is how effectively it uses its limited parameter budget. Through expanded pretraining, reinforcement learning techniques, and architecture optimization, Liquid AI positions the model as a serious competitor to significantly larger open-source alternatives.

The company reports strong benchmark performance across reasoning, instruction following, and knowledge tasks while maintaining inference speeds suitable for edge deployment.

Built for Edge AI

Edge AI is rapidly becoming one of the most important trends in machine learning. Users increasingly expect intelligent systems to work instantly, privately, and without constant internet connectivity.

LFM2.5-1.2B was designed with these requirements in mind:

Fast CPU inference
Mobile NPU compatibility
Low memory consumption
Reduced cloud costs
Improved privacy through local execution
Support for long-context applications

This makes the model particularly attractive for AI-powered mobile apps, personal assistants, local copilots, and enterprise edge deployments.

Developer Ecosystem and Deployment Options

Another strength of the LFM2.5 ecosystem is broad tooling support. Developers can deploy the model using popular frameworks and runtimes, enabling quick experimentation and production deployment.

llama.cpp
vLLM
MLX
Ollama
Transformers
Fine-tuning frameworks such as Unsloth and TRL

This flexibility lowers adoption barriers and makes the model accessible to startups, enterprises, and independent developers alike.

Potential Use Cases

Thanks to its balance between size and capability, LFM2.5-1.2B can power a wide range of applications:

AI chat assistants
Document analysis and summarization
Retrieval-augmented generation (RAG)
Agent-based workflows
Code assistance
Knowledge management systems
Offline AI applications
Smart device integrations

The Future of Efficient AI

The AI industry is gradually realizing that bigger isn't always better. While frontier-scale models continue to push capability boundaries, compact models like LFM2.5-1.2B are making advanced AI accessible to a much broader range of devices and users.

This trend mirrors the evolution of computing itself: powerful technologies eventually become smaller, cheaper, and more widely available.

Final Thoughts

Liquid AI's LFM2.5-1.2B demonstrates how thoughtful architecture, extensive training, and deployment-focused engineering can create a highly capable language model without requiring massive computational resources.

For developers seeking fast, private, and cost-effective AI solutions, LFM2.5-1.2B represents an exciting glimpse into the future of edge AI. Rather than chasing scale alone, it proves that efficiency can be just as transformative.

See All on AI Model Releases « Previously Tags: Agentic AI,Large Language Models,AI Model Alert,

Pages

Tuesday, June 16, 2026

Attempting to run a local LLM via LM Studio

Explain these logs... (logs attached at the end of the post)

What's happening

Why every model fails

The important clues

Clue 1

Clue 2

The strange token warning

Most likely setup

How to verify

Fixes

Option 1: Disable GPU acceleration (recommended)

Option 2: Update Mesa

Option 3: Use OpenCL/CPU backend

Option 4: Use a newer GPU

Bottom line

(base) jain@jain-ThinkPad-T440s:~$ lspci | grep VGA 00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)

Why LM Studio fails

What you should do

Disable GPU acceleration completely

Can the T440s still run local models?

Verify LM Studio is using CPU

If LM Studio still insists on Vulkan

Complete Logs

Liquid LFM2.5-1.2B: A Tiny AI Model with Surprisingly Big Ambitions

Liquid LFM2.5-1.2B: A Tiny AI Model with Surprisingly Big Ambitions

What Is LFM2.5-1.2B?

Model Size

Context Length

Deployment

Memory Footprint

Why Small Models Matter Again

The Three Variants of LFM2.5-1.2B

1. LFM2.5-1.2B-Base

2. LFM2.5-1.2B-Instruct

3. LFM2.5-1.2B-Thinking

Performance Beyond Its Size

Built for Edge AI

Developer Ecosystem and Deployment Options

Potential Use Cases

The Future of Efficient AI

Final Thoughts

(base) jain@jain-ThinkPad-T440s:~$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Haswell-ULT Integrated Graphics Controller (rev 0b)