Samsung On-Device AI Patent Applications: Into the Memory

Among the 81 U.S. patent applications Samsung Electronics published in the week of May 12, 2026, one cluster keeps returning to a single idea: do the AI computation inside or beside the memory, and shrink the models that run on the device. It is a forward-looking signal about where the company is taking on-device intelligence.

Follow the cluster, not the single filing. Samsung Electronics published 81 U.S. patent applications in the week of May 12 to May 18, 2026, and scattered through the semiconductor-process and storage-controller filings is a coherent run of applications about one thing: moving artificial-intelligence computation closer to the memory it reads from, and compressing the models so they can run on the device rather than in a data center. For a company whose hardware spans the memory chips, the mobile processors, and the phones that use both, the direction is worth reading.

The clearest expression is processing-in-memory. US20260133901A1, "Method and apparatus with processing-in-memory computation address generation," describes generating a memory request tied to a processing-in-memory operation and then "designating, by the memory controller, a memory address through bank shuffling, the bank shuffling mapping the PIM computation address to a set of addresses shuffled to a preset target bank." A companion application, US20260133899A1, covers generating the processing-in-memory request itself from a command table. Processing-in-memory is the idea of running computation where the data already sits, rather than shuttling it back and forth to a processor — and two filings in one week describe the request-and-address machinery to do it.
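To make the quoted claim language concrete, here is a minimal sketch of what steering addresses to a target bank could look like. It is an illustration only: the application's actual shuffling scheme is not described here, and the bank count, row width, and function names (`interleaved_map`, `shuffle_to_target_bank`) are all hypothetical.

```python
from dataclasses import dataclass

# All sizes below are hypothetical, chosen only to keep the example small.
NUM_BANKS = 8
COLS_PER_ROW = 16


@dataclass(frozen=True)
class DramAddress:
    bank: int
    row: int
    col: int


def interleaved_map(linear: int) -> DramAddress:
    """Conventional interleaving: consecutive addresses rotate across banks."""
    bank = linear % NUM_BANKS
    within_bank = linear // NUM_BANKS
    return DramAddress(bank, within_bank // COLS_PER_ROW, within_bank % COLS_PER_ROW)


def shuffle_to_target_bank(linear: int, target_bank: int) -> DramAddress:
    """Toy shuffle (an assumption, not the claimed method): pin the bank
    field to target_bank and spread the address over row and column."""
    if not 0 <= target_bank < NUM_BANKS:
        raise ValueError("target_bank out of range")
    return DramAddress(target_bank, linear // COLS_PER_ROW, linear % COLS_PER_ROW)


pim_operands = range(8)  # eight consecutive PIM computation addresses
spread = [interleaved_map(a).bank for a in pim_operands]
pinned = [shuffle_to_target_bank(a, target_bank=3) for a in pim_operands]
```

Under ordinary interleaving the eight operands land in eight different banks. The toy shuffle keeps them distinct but places all of them in bank 3, which is the property a bank-local compute unit would need.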

Around that core sit the supporting pieces. US20260133907A1 describes a memory device that runs cache-hit determination and data lookup in parallel inside a memory processing unit — pre-read acceleration that keeps the compute unit fed. US20260133837A1 covers a reconfigurable accelerator that allocates processing elements to a vector lane and releases the allocation around loop operations, a flexibility mechanism for an AI-compute block. These are filings about the plumbing of running models efficiently in hardware.

Shrinking the model to fit the device

The second half of the cluster is about the models themselves. US20260134290A1, "Method and apparatus with model generation," describes generating a smaller second neural network from a set of first networks "using knowledge distillation," with a distillation loss derived from two kinds of uncertainty. Knowledge distillation is the standard technique for compressing a large model into a small one that can run on a phone or an appliance. US20260134067A1 adds a multimodal-model fingerprinting method, and US20260136091A1 covers on-device point-cloud generation that uses an AI model to flag bad capture points and guide a user to re-photograph an object — an AI task running on the device itself.
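A rough sketch of distillation from a set of teacher networks follows. The application's loss is not spelled out here, so the sketch assumes one common reading of "two kinds of uncertainty": aleatoric uncertainty (the teachers' average entropy) and epistemic uncertainty (the teachers' disagreement). The weighting rule and every function name are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)


def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def ensemble_uncertainties(teacher_probs):
    """Return the mean teacher distribution plus two uncertainty terms."""
    n = len(teacher_probs)
    mean = [sum(column) / n for column in zip(*teacher_probs)]
    aleatoric = sum(entropy(p) for p in teacher_probs) / n  # average teacher entropy
    epistemic = max(0.0, entropy(mean) - aleatoric)         # teacher disagreement
    return mean, aleatoric, epistemic


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL from the averaged teachers to the student, down-weighted when the
    teachers are uncertain. The weighting is an assumption for illustration."""
    teacher_probs = [softmax(t, temperature) for t in teacher_logits]
    mean, aleatoric, epistemic = ensemble_uncertainties(teacher_probs)
    student = softmax(student_logits, temperature)
    weight = 1.0 / (1.0 + aleatoric + epistemic)
    return weight * kl_divergence(mean, student)


teachers = [[2.0, 1.0, 0.1], [1.5, 1.2, 0.3]]
loss = distillation_loss(teachers, student_logits=[0.1, 1.0, 2.0])
```

The loss falls toward zero as the student's distribution approaches the teachers' average, and examples on which the teachers are unsure or disagree contribute less to training.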

The same on-device-AI thread runs into vehicles. US20260134253A1, "Device and method with autonomous driving using artificial intelligence," describes an AI model with a first network that routes among several second networks tied to specific driving skills, and US20260131821A1 covers extracting a bird's-eye-view feature for autonomous driving from a diffusion-model-generated scenario. Both are inference tasks meant to run on the vehicle.
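The routing arrangement in the first of these filings can be pictured with a toy example. The skill names, input features, and router weights below are hypothetical and are not taken from the application. The sketch shows only the general pattern of a first stage scoring the situation and handing it to one skill-specific second stage.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical input features: [lane_offset, closeness_to_lead_vehicle, red_signal]
Features = List[float]

# Stand-ins for the skill-specific "second networks".
SKILL_NETWORKS: Dict[str, Callable[[Features], str]] = {
    "lane_keeping": lambda f: "correct lane offset",
    "car_following": lambda f: "adjust speed to lead vehicle",
    "stopping": lambda f: "brake for signal",
}

# Stand-in for the "first network": one linear score per skill.
ROUTER_WEIGHTS: Dict[str, Features] = {
    "lane_keeping": [1.0, 0.0, 0.0],
    "car_following": [0.0, 1.0, 0.0],
    "stopping": [0.0, 0.0, 1.0],
}


def route(features: Features) -> str:
    """Score every skill and return the name of the highest-scoring one."""
    scores = {
        skill: sum(w * x for w, x in zip(weights, features))
        for skill, weights in ROUTER_WEIGHTS.items()
    }
    return max(scores, key=scores.get)


def drive(features: Features) -> Tuple[str, str]:
    """Run the router, then only the selected skill network."""
    skill = route(features)
    return skill, SKILL_NETWORKS[skill](features)


decision = drive([0.0, 0.2, 0.8])
```

Because only the selected skill network runs for a given input, a design of this kind could keep per-decision compute low, which would matter for inference on the vehicle.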

The cluster has a security and integrity edge as well, which is consistent with shipping models onto devices that leave the manufacturer's control. US20260134103A1, "Storage package," describes a memory controller that downloads firmware carrying a first signature, verifies it with a paired public key, then writes a second signature derived from a password-decrypted secret key. The result is a two-signature chain for trusting code loaded onto a storage device. The multimodal-fingerprinting application, US20260134067A1, sits in the same territory: a method for marking a trained model so its provenance can be checked later. When a company is moving valuable models out onto phones, appliances, and cars, the questions of whose code is running and which model is which become engineering problems, and the week's applications address both.

A word of caution belongs here. These are published applications, not granted patents: each describes a claimed approach that has not yet been examined to issuance, and the eighteen-month publication lag means the work itself predates the publication date. A single week's applications also cannot be read as the whole of Samsung's AI program — the same 81-filing batch includes a large body of conventional semiconductor-device and storage-controller applications unrelated to the AI thesis, and the company files across many directions at once. What the cluster supports is a narrower, well-grounded reading: in this window, a coherent set of Samsung's filings is directed at putting computation next to memory and shrinking models to run on the device. That is a recognizable strategic direction for a company that builds the memory, the processors, and the products that consume both, and the filings name the specific mechanisms — bank shuffling, distillation loss, reconfigurable vector lanes — by which it intends to get there.

What the direction signals

Read together, the week's applications point to a consistent R&D direction: push the AI computation into the memory and the accelerator, and compress the models so the work can happen on the device (a phone, an appliance, or a car) rather than entirely in the cloud. The classification data tracks it, with G06N (machine-learning models) and G06F 12 (memory-architecture) classifications recurring across the cluster. For a business reader, the signal is about positioning: Samsung sits at the intersection of the memory it manufactures and the devices that run models on it, and these filings indicate the company is investing in making those two ends meet, with computation and memory in the same place and the model small enough to live there. As published applications, they describe direction rather than shipped product, but the direction is unusually consistent for a single week.
