Memory-Mapped I/O vs Standard File I/O

February 20, 2025

When dealing with large files, efficient file handling can make a significant difference in performance. There are two common approaches: memory-mapped I/O (mmap) and standard file I/O (std). In this blog post, I’ll explore a Rust program that benchmarks both methods, analyzing their strengths and weaknesses.

Image Embedding Inference - Serving Open Source Image Embeddings Model

January 16, 2025

Image embedding is a powerful tool in machine learning that converts images into numerical vectors, making them easier to analyze, compare, and use in downstream tasks like classification, clustering, or recommendation systems. This post goes over the Image Embedding Inference (IEI) project, which provides a REST API for generating embeddings using pretrained models from Hugging Face. IEI is part of a larger project I’m building that focuses on image retrieval and multi-modal knowledge building.

LLMs vs Rerankers for Intent Routing

December 26, 2024

Intent routing is essential for many Generative AI (GenAI) applications, enabling systems to accurately interpret user queries and route them to the appropriate actions. The flexibility and contextual understanding of Large Language Models (LLMs) have made them a go-to choice for intent classification tasks. However, embedding-based rerankers offer a compelling alternative, delivering high accuracy with significantly lower computational costs and latency.

This analysis compares a reranker model (BAAI/bge-reranker-large) with a state-of-the-art LLM (claude-3-haiku-20240307) for intent routing, focusing on their trade-offs in efficiency, accuracy, and suitability for real-world use cases.

Serving Deep Learning Models with Rust (and comparing it to Python)

December 10, 2024

Building a service to serve machine learning models, such as a BERT-based embedding generator, requires careful consideration of factors like performance, ease of development, and maintainability. This article explores two implementations of such a service—one in Rust and the other in Python—and highlights their design choices, strengths, and trade-offs.

To explore the efficiency of serving machine learning models, I conducted a benchmark comparison between Python and Rust implementations. My initial hypothesis was that Rust, with its reputation for high performance and low-level control, would significantly outperform Python.

Additionally, I aimed to investigate how different concurrency mechanisms, such as RwLock and Mutex, and the choice between sharing or not sharing the model state among workers would influence performance.

The Actor Model in Rust

November 24, 2024

In this post we tinker with the actor model in Rust, an interesting exercise given the language’s unique features. Rust’s strengths in memory safety and concurrency make it a great choice for building robust, concurrent systems. We’ll explore a program that implements an actor model using the asynchronous runtime Tokio. The example illustrates message-passing, state management, and graceful shutdown in a concurrent environment.

Implementing a Singleton Pattern in Rust: A Practical Example

November 16, 2024

In this blog post, we’ll explore how to implement a Singleton pattern in Rust. The Singleton design pattern ensures a class has only one instance while providing a global access point to that instance. Rust’s ownership model and thread safety make this implementation an interesting challenge.
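As a rough illustration of the idea (not necessarily the approach taken in the post), one way to sketch a thread-safe singleton uses `std::sync::OnceLock` from the standard library. The `Config` type, its `counter` field, and the `instance` and `increment` functions are hypothetical names chosen for this sketch.

```rust
use std::sync::{Mutex, OnceLock};

// Hypothetical state held by the singleton.
struct Config {
    counter: u32,
}

// Global access point: the value is created on first call and the same
// reference is returned on every later call, from any thread.
fn instance() -> &'static Mutex<Config> {
    static INSTANCE: OnceLock<Mutex<Config>> = OnceLock::new();
    INSTANCE.get_or_init(|| Mutex::new(Config { counter: 0 }))
}

// Mutates the shared state through the Mutex and returns the new value.
fn increment() -> u32 {
    let mut guard = instance().lock().unwrap();
    guard.counter += 1;
    guard.counter
}

fn main() {
    let first = increment();
    let second = increment();
    println!("first = {first}, second = {second}");
    // Both calls hand back a reference to the very same allocation.
    println!("same instance: {}", std::ptr::eq(instance(), instance()));
}
```

The `Mutex` is only needed because this sketch mutates the shared value; a read-only singleton could store the value in the `OnceLock` directly.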

Asynchronous Programming and the Actor Model in Golang

November 18, 2023

This blog post provides insights into Golang’s concurrent programming features. It delves into the implementation of actors: independent entities that communicate through messages. We walk through a simple actor system implementation in Go that showcases actor creation, message sending, and concurrent message processing, highlighting the principles of the actor model.

A Keras-like Deep Learning Crate in Rust Powered by Candle

November 04, 2023

While candle provides a low-level API for building neural network models, some users (including myself) may prefer a more intuitive and user-friendly way to define, compile, and train their models. That’s why I propose the addition of a high-level Keras-like API to candle. This API would allow users to define models in a sequential manner by adding layers one after the other. It would also provide methods for model compilation, training, and evaluation.

Mojo - An Introduction

October 22, 2023

A first look at Mojo. In this post I scratch the surface of Mojo’s syntax and compare its borrow semantics to Rust’s.

Using Actix to expose a command line tool

October 13, 2023

In this blog post, we’ll explore a Rust program that exposes an existing command-line tool via a REST API. This program leverages the Actix-web framework to create a simple HTTP server, handles HTTP requests, and interacts with an external CLI tool.

Understanding Ord, Eq, PartialOrd, and PartialEq Traits in Rust

May 30, 2023

Rust provides several traits that are fundamental for comparing and ordering values. These traits include Ord, Eq, PartialOrd, and PartialEq. In this blog post, we’ll explore what these traits are, when they can and can’t be derived, and how they can be used to sort a vector.
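A minimal sketch of the distinction, assuming two hypothetical types (`Version` and `Reading`) invented for this example: a struct of integers can derive all four traits and be sorted with `sort()`, while a struct holding an `f64` can only derive the partial traits and needs `sort_by` with an explicit comparison.

```rust
use std::cmp::Ordering;

// Hypothetical type: every field implements Ord and Eq, so all four
// traits can be derived. Derived ordering compares fields top to bottom.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
}

// Hypothetical type: f64 implements PartialOrd but not Ord (NaN has no
// defined order), so only the partial traits can be derived here.
#[derive(Debug, Clone, PartialEq, PartialOrd)]
struct Reading {
    value: f64,
}

// sort() requires Ord.
fn sorted_versions(mut versions: Vec<Version>) -> Vec<Version> {
    versions.sort();
    versions
}

// With only PartialOrd, fall back to sort_by and decide what to do when
// two values are incomparable (treated as equal in this sketch).
fn sorted_readings(mut readings: Vec<Reading>) -> Vec<Reading> {
    readings.sort_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal));
    readings
}

fn main() {
    let versions = vec![
        Version { major: 1, minor: 2 },
        Version { major: 0, minor: 9 },
        Version { major: 1, minor: 0 },
    ];
    println!("{:?}", sorted_versions(versions));

    let readings = vec![Reading { value: 2.5 }, Reading { value: -1.0 }];
    println!("{:?}", sorted_readings(readings));
}
```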

How to process and download financial data in Rust

February 17, 2023

This is the first blog post of a series that will focus on time series analysis and forecasting of stock prices. This post shows how to get data from Yahoo Finance and parse JSON responses in Rust.

Rust - Dynamic dispatching and Generics

January 28, 2023

In this blog post I show the main differences between dynamic dispatch (dyn) and generics (impl/<T>), using a simple program that creates different database connectors.

NB: this blog post assumes some familiarity with Rust.
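To give a flavor of the contrast, here is a small sketch under assumed names: the `Connector` trait, the `Postgres` and `Sqlite` types, and the three helper functions are hypothetical and are not taken from the post's program.

```rust
// Hypothetical trait standing in for a database connector interface.
trait Connector {
    fn connect(&self) -> String;
}

struct Postgres;
struct Sqlite;

impl Connector for Postgres {
    fn connect(&self) -> String {
        "connected to postgres".to_string()
    }
}

impl Connector for Sqlite {
    fn connect(&self) -> String {
        "connected to sqlite".to_string()
    }
}

// Generics (static dispatch): the compiler generates one copy of this
// function per concrete type it is called with.
fn connect_static(connector: &impl Connector) -> String {
    connector.connect()
}

// Trait object (dynamic dispatch): one compiled function, with the
// method looked up through a vtable at run time.
fn connect_dynamic(connector: &dyn Connector) -> String {
    connector.connect()
}

// Trait objects also allow mixing different concrete types in one
// collection, which a single generic parameter cannot do.
fn connect_all(connectors: &[Box<dyn Connector>]) -> Vec<String> {
    connectors.iter().map(|c| c.connect()).collect()
}

fn main() {
    println!("{}", connect_static(&Postgres));
    println!("{}", connect_dynamic(&Sqlite));

    let mixed: Vec<Box<dyn Connector>> = vec![Box::new(Postgres), Box::new(Sqlite)];
    for line in connect_all(&mixed) {
        println!("{line}");
    }
}
```

The generic version tends to be the default when the concrete type is known at compile time; the `dyn` version is useful when the type is chosen at run time or when heterogeneous values must live in one collection.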