Modernizing a CUDA library from 2015

January 10, 2026 by Samuel Vaiter

In 2022, we released (led by Thomas Moreau) benchopt, a benchmarking suite tailored for machine learning workflows. Among the initial benchmarks, there was one dedicated to 2D Total Variation. Total Variation is a regularization introduced in 1992 by Leonid Rudin, Stanley Osher and Emad Fatemi that was (is) quite popular in inverse problems for imaging, before the use of deep priors.
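For readers unfamiliar with it, here is a short reminder of the problem in question. It is a sketch in generic notation, not the exact discretization used by any particular solver: $D_x$ and $D_y$ stand for finite-difference operators whose precise form and boundary handling I leave unspecified, $f$ is the noisy image and $\lambda > 0$ the regularization parameter. The denoising problem of Rudin, Osher and Fatemi is precisely the proximal operator of the (isotropic) Total Variation.

```latex
\mathrm{TV}(u) = \sum_{i,j} \sqrt{ (D_x u)_{i,j}^2 + (D_y u)_{i,j}^2 },
\qquad
\operatorname{prox}_{\lambda \mathrm{TV}}(f)
  = \operatorname*{arg\,min}_{u} \; \frac{1}{2} \| u - f \|_2^2 + \lambda \, \mathrm{TV}(u).
```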

I opened an issue to add one of my solvers, ftvp, a CUDA library that I developed in 2014-2015 when I was a postdoc with Antonin Chambolle. It was designed to be a fast implementation of a Total Variation solver based on our joint paper with Pauline Tan and Antonin. But I never took the time to make it easily pip/conda/uv-friendly, and the idea kinda died due to the usual (pretext of) lack of time.

Fast-forward to 2026: agents have become quite good at manipulating existing codebases, especially simple ones like ftvp. Today I tried Amp with Claude Opus 4.5, using the following prompt.

My goal is to modernize this repository. ftvp is a CUDA library dedicated to the computation of the proximal operator of the isotropic Total Variation in 2D and 3D on Nvidia GPU. I want to drop the C-first aspect, the goal is to package a Python algorithm. It should use modern CUDA as of 2026. Do not change the algorithm, but feel free to change everything. The use is described in README.md. A reference implementation in C is in TV4colorCPTV.c. The other files are the CUDA implementation. Ultimately I want a PyTorch-aware function proxtv(img, lambda) that works on BW and color images where img is the torch array and lamba the regularization parameter of TV.

Following this prompt, Amp spent $1.20 to generate this commit. It forgot to add ninja as a requirement, and the examples only tested the prox on a black square. So I asked:

You need to add ninja to the requirements to be able to JIT.

Write me an additional examples that perform the denoising on a real image

The generated script was correct, but again it missed two dependencies required to run the examples.

Add numpy and pillow as dependency for running this example.
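This kind of omission is easy to catch with a small preflight check run before the example. The sketch below is my own illustration, not part of the generated commit: the helper name `missing_dependencies` is hypothetical, and the list of requirements (torch, numpy, and pillow importable as `PIL`, plus a `ninja` executable on the PATH) is an assumption based on the prompts above.

```python
import importlib.util
import shutil


def missing_dependencies(modules, executables):
    """Return the names from both lists that cannot be found."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(exe) is None:
            missing.append(exe)
    return missing


if __name__ == "__main__":
    # Assumed requirements of the denoising example; adjust to the real ones.
    missing = missing_dependencies(["torch", "numpy", "PIL"], ["ninja"])
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

It only uses the standard library, so it can run in a fresh environment before anything else is installed.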

Finally, I generated the pull request description as follows:

Generate a summary of the changes so that I can copy it in the PR interface of githb. Be explicit on the fact that is was driven by a AI agent.

Overall, the total cost was $2.15 (of free credits) for something that I am convinced I would never have done otherwise. The full thread is here. I still have to check that the result is correct before merging it, and then submit a pull request to benchopt/benchmark_tv_2d.

I will update this feedback note when this is done.

tags: agent optimization