Nemotron

Nemotron: Advancing Tool-Calling LLMs with Rule-Based Reinforcement Learning 🤖 Large Language Models (LLMs) are becoming increasingly powerful, and their ability to interact with external tools and APIs significantly expands their capabilities. Nvidia’s Nemotron paper introduces an innovative approach to training LLMs for more effective tool use, focusing on a rule-based reinforcement learning (RL) pipeline. This method aims to overcome the common hurdle of requiring large, meticulously curated datasets, allowing models to learn optimal tool-calling strategies more autonomously. ...
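
The excerpt mentions a rule-based reward for tool calls. Purely as an illustration of what such a check can look like (not the paper's actual reward design; the function name and scoring scheme below are hypothetical), a reward can simply verify that the emitted call parses and matches a reference call:

```python
import json

def tool_call_reward(model_output: str, reference: dict) -> float:
    """Hypothetical rule-based reward: checks format, tool name, and arguments.
    Illustrative sketch only, not the reward defined in the Nemotron paper."""
    try:
        call = json.loads(model_output)          # the call must be valid JSON
    except json.JSONDecodeError:
        return 0.0                               # unparseable output gets no reward
    reward = 0.0
    if call.get("name") == reference["name"]:    # correct tool selected
        reward += 0.5
        if call.get("arguments") == reference["arguments"]:  # exact argument match
            reward += 0.5
    return reward

# Example: a correct call earns the full reward of 1.0
ref = {"name": "get_weather", "arguments": {"city": "Paris"}}
print(tool_call_reward('{"name": "get_weather", "arguments": {"city": "Paris"}}', ref))
```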

June 5, 2025 · 7 min · 1393 words · Julien Seveno

Flash Attention

Introduction Transformers have revolutionized the field of machine learning, emerging as the dominant architectural choice across various applications. However, their reliance on self-attention mechanisms introduces significant computational challenges, particularly due to quadratic time and memory complexity relative to sequence length. While approximate solutions exist, their limited adoption stems from an overemphasis on theoretical FLOP counts rather than practical performance metrics. In 2022, a paper introduced a way to compute the attention output by working only on sub-blocks of the inputs, reducing memory I/O. ...
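
To give an idea of the blockwise computation, here is a rough NumPy sketch of attention evaluated one key/value block at a time with an online softmax. It is illustrative only: block size, shapes, and the sanity check are made up, and the real FlashAttention kernel also tiles the queries and fuses everything on-chip.

```python
import numpy as np

def blockwise_attention(Q, K, V, block_size=64):
    """Attention computed over key/value blocks with an online softmax
    (a sketch of the tiling idea, not the actual FlashAttention kernel)."""
    n, d = Q.shape
    out = np.zeros_like(Q, dtype=np.float64)
    row_max = np.full(n, -np.inf)          # running max of scores per query
    row_sum = np.zeros(n)                  # running softmax denominator per query
    scale = 1.0 / np.sqrt(d)
    for start in range(0, K.shape[0], block_size):
        Kb, Vb = K[start:start + block_size], V[start:start + block_size]
        S = (Q @ Kb.T) * scale                       # scores for this key block only
        new_max = np.maximum(row_max, S.max(axis=1))
        correction = np.exp(row_max - new_max)       # rescale previous partial results
        P = np.exp(S - new_max[:, None])
        row_sum = row_sum * correction + P.sum(axis=1)
        out = out * correction[:, None] + P @ Vb
        row_max = new_max
    return out / row_sum[:, None]

# Sanity check against the naive quadratic implementation
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(128, 32)) for _ in range(3))
S = Q @ K.T / np.sqrt(32)
ref = np.exp(S - S.max(axis=1, keepdims=True))
ref = (ref / ref.sum(axis=1, keepdims=True)) @ V
assert np.allclose(blockwise_attention(Q, K, V), ref)
```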

May 18, 2025 · 6 min · 1131 words · Julien Seveno

Webformers

Introduction Extracting structured information from web pages remains a challenging task in natural language processing. The regular transformer architecture is not designed to encode hierarchical information: each token attends to every other token in the input sequence, regardless of position (even though a few mechanisms, such as positional encoding, introduce positional information). In a web page, information is highly structured. The HTML forms a tree, with each node having a parent and potentially siblings and children. As a result, some nodes may be semantically related while sitting relatively far apart if we only count the tokens between them. ...
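
To make the tree-distance versus token-distance point concrete, here is a small self-contained toy example of my own (not the WebFormer model itself): two sibling fields of the same card share a parent in the DOM even when many unrelated tokens separate them in the serialized page.

```python
from html.parser import HTMLParser

class TreeBuilder(HTMLParser):
    """Collects (node_path, text) pairs from an HTML string.
    Purely illustrative: models like WebFormer encode the DOM structure
    with dedicated attention patterns, not with path strings."""
    def __init__(self):
        super().__init__()
        self.stack, self.nodes = [], []
    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()
    def handle_data(self, data):
        text = data.strip()
        if text:
            self.nodes.append(("/".join(self.stack), text))

page = """
<div><span>Price</span>
  <ul><li>long</li><li>unrelated</li><li>description</li></ul>
  <span>19.99</span></div>
"""
builder = TreeBuilder()
builder.feed(page)
for path, text in builder.nodes:
    print(path, "->", text)
# The two <span> nodes ("Price" and "19.99") share the same <div> parent,
# so they are one hop apart in the tree even though several list items
# (and their tokens) sit between them in the flat token sequence.
```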

April 16, 2025 · 8 min · 1492 words · Julien Seveno

Kolmogorov AI Framework | Part 2 - brainfuck

Motivations I recently started a series of articles about a framework I have been thinking about, which allows agents to be trained with reinforcement learning, executing one action at a time in specific environments. The first part of the series can be found here: Kolmogorov AI Framework. Environment To build this proof of concept, I chose the brainfuck programming language. It belongs to the family of esoteric programming languages, but it is made of only 8 instructions, making it very easy to start with as an execution environment. ...
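
For reference, brainfuck's eight single-character instructions (> < + - . , [ ]) operate on a tape of byte cells, and a minimal interpreter fits in a few dozen lines. The sketch below is a generic interpreter written for illustration, not the execution environment used in the article (input via ',' is not handled).

```python
def run_brainfuck(program: str, tape_size: int = 30_000) -> str:
    """Minimal brainfuck interpreter (illustrative only)."""
    tape = [0] * tape_size
    out, ptr, pc = [], 0, 0
    # Pre-compute matching bracket positions for '[' / ']'
    jumps, stack = {}, []
    for i, c in enumerate(program):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    while pc < len(program):
        c = program[pc]
        if c == ">":
            ptr += 1
        elif c == "<":
            ptr -= 1
        elif c == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif c == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif c == ".":
            out.append(chr(tape[ptr]))
        elif c == "[" and tape[ptr] == 0:
            pc = jumps[pc]
        elif c == "]" and tape[ptr] != 0:
            pc = jumps[pc]
        pc += 1
    return "".join(out)

# A tiny loop: 5 * 13 increments leave 65 in the second cell, which prints 'A'
print(run_brainfuck("+++++[>+++++++++++++<-]>."))
```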

April 2, 2025 · 7 min · 1416 words · Julien Seveno

EPITA Courses - Continuous Physics Informed Neural Networks

Introduction Neural networks require large amounts of data to converge, and that data needs to represent the task the neural network is trying to learn. Data collection is a tedious process, especially when gathering data is difficult or expensive. In science, physics for instance, many phenomena are described by theories that we know work very well. Using this knowledge as regularization can help neural networks generalize better with less data. ...
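
As a toy illustration of using known physics as a regularizer (a generic sketch, not the course material), the loss can combine a data-fitting term with the residual of a differential equation at collocation points; here the ODE u'(x) = -u(x) is enforced with finite differences rather than autograd.

```python
import numpy as np

def pinn_loss(u, x_data, y_data, x_coll, lam=1.0, eps=1e-4):
    """Composite physics-informed loss (toy sketch).

    u       : callable approximating the solution, e.g. a small neural network
    x_data  : points where (noisy) measurements y_data are available
    x_coll  : collocation points where only the physics u'(x) = -u(x) is enforced
    """
    data_loss = np.mean((u(x_data) - y_data) ** 2)
    # Physics residual via central finite differences (autograd in practice)
    du = (u(x_coll + eps) - u(x_coll - eps)) / (2 * eps)
    physics_loss = np.mean((du + u(x_coll)) ** 2)   # residual of u' + u = 0
    return data_loss + lam * physics_loss

def u_exact(x):
    return np.exp(-x)   # exact solution of u' = -u with u(0) = 1

x_d, x_c = np.linspace(0, 1, 5), np.linspace(0, 2, 50)
print(pinn_loss(u_exact, x_d, u_exact(x_d), x_c))   # ~0: the exact solution has no residual
```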

March 16, 2025 · 8 min · 1673 words · Julien Seveno

EPITA Courses - Transformers

Context Generating data is now a hot topic in machine learning. The idea of using statistical methods to produce synthetic data is rather old, and many methods have proven effective in different scenarios. Today, the most well-known ways to generate synthetic data are VAEs, GANs, and Transformers. Transformers A bit of history We talked about RNNs last week and saw how they can be used to predict sequences. Unfortunately, RNNs suffer from several problems, especially with long sequences, where they seem to forget what happened. ...

March 16, 2025 · 10 min · 1929 words · Julien Seveno

EPITA Courses - Recurrent Neural Networks

Introduction Motivation Why is it necessary to introduce the concept of recurrence in neural networks? We memorize certain things as sequences. Take the alphabet: we can recite it from A to Z without even thinking about it. However, if we were asked to recite it backwards, or even worse, to give the letters by position (the 17th letter, then the 5th…), we would be unable to do so. ...

March 16, 2025 · 8 min · 1702 words · Julien Seveno

EPITA Courses - Timeseries

Time Series Time series & stochastic processes A time series is a set of observations $x_t$ generated sequentially over time $t$. There are two main types of time series: continuous and discrete. We can also distinguish time series whose values are determined by a mathematical function, called deterministic time series, from those that have a random component, called non-deterministic time series. To forecast non-deterministic time series, we assume that there is a probability model generating the observations of the series. ...
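
As a minimal example of a non-deterministic series generated by a probability model (my own illustration, not taken from the course), an AR(1) process draws each observation from the previous one plus Gaussian noise.

```python
import numpy as np

def simulate_ar1(phi=0.8, sigma=1.0, n=200, x0=0.0, seed=0):
    """Simulate an AR(1) process x_t = phi * x_{t-1} + eps_t, eps_t ~ N(0, sigma^2).
    A deterministic series would be fully described by a function of t;
    here the Gaussian noise makes every realization different."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(scale=sigma)
    return x

series = simulate_ar1()
print(series[:5])
```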

March 16, 2025 · 20 min · 4164 words · Julien Seveno

About Me

I am Julien, a French AI engineer since 2017. I have worked in several organizations on different topics, all related to AI or applying it. I love playing golf (even though I am not very good at it…), going to the gym, reading, learning new things and working (not a joke, I do love it). I sometimes write articles about tech, most of them AI related: ...

March 16, 2025 · 1 min · 188 words · Julien Seveno

MLX DQN

Reinforcement Learning with Apple MLX Framework Today, a very short article about Apple's MLX framework. I recently learned that Apple has its own machine learning framework, and as a heavy Mac user I thought I'd give it a try. It is very easy to use and intuitive; the syntax is nice and resembles NumPy and PyTorch, which is convenient for a PyTorch user. As an example, let me present a Deep Q-Learning implementation that I wrote, based on a nice DeepMind paper. ...
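
To give a flavour of the framework, here is a rough sketch of a DQN-style temporal-difference update written against the public mlx.nn / mlx.optimizers API. It is not the article's implementation: the network sizes, hyperparameters, and toy batch below are made up for illustration.

```python
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

class QNetwork(nn.Module):
    """Small MLP mapping an observation to one Q-value per action."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.l1 = nn.Linear(obs_dim, hidden)
        self.l2 = nn.Linear(hidden, n_actions)

    def __call__(self, x):
        return self.l2(nn.relu(self.l1(x)))

def td_loss(model, target_model, obs, actions, rewards, next_obs, dones, gamma=0.99):
    # Q(s, a) for the actions actually taken (one-hot trick avoids fancy indexing)
    q = mx.sum(model(obs) * actions, axis=1)
    # Bootstrapped target: r + gamma * max_a' Q_target(s', a') on non-terminal steps
    next_q = mx.max(mx.stop_gradient(target_model(next_obs)), axis=1)
    target = rewards + gamma * (1.0 - dones) * next_q
    return mx.mean((q - target) ** 2)

model, target_model = QNetwork(4, 2), QNetwork(4, 2)
optimizer = optim.Adam(learning_rate=1e-3)
loss_and_grad = nn.value_and_grad(model, td_loss)

# One update on a fake batch of 8 transitions (CartPole-like shapes)
obs = mx.random.normal((8, 4))
actions = mx.array([[1.0, 0.0]] * 8)          # one-hot encoded actions
rewards, dones = mx.ones((8,)), mx.zeros((8,))
next_obs = mx.random.normal((8, 4))
loss, grads = loss_and_grad(model, target_model, obs, actions, rewards, next_obs, dones)
optimizer.update(model, grads)
mx.eval(model.parameters(), optimizer.state)
print(loss)
```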

March 16, 2025 · 5 min · 1064 words · Julien Seveno