Bernardo de Lemos

Image Embedding Inference - Serving Open Source Image Embeddings Model

Image embedding is a powerful tool in machine learning that allows us to convert images into numerical vectors, making it easier to analyze, compare, and use them in downstream tasks like classification, clustering, or recommendation systems. This post goes over the Image Embedding Inference (IEI) project, which provides a REST API for generating embeddings using pretrained models from Hugging Face. This project is part of a larger one I’m building which focuses on image retrieval and multi-modal knowledge building.



LLMs vs Rerankers for Intent Routing

Intent routing is essential for many Generative AI (GenAI) applications, enabling systems to accurately interpret user queries and route them to the appropriate actions. With the rise of Large Language Models (LLMs), their flexibility and contextual understanding have made them a go-to choice for intent classification tasks. However, embedding-based rerankers offer a compelling alternative, delivering high accuracy with significantly lower computational costs and latency.

This analysis compares the performance of a reranker model (BAAI/bge-reranker-large) versus a state-of-the-art LLM (claude-3-haiku-20240307) for intent routing, focusing on their trade-offs in efficiency, accuracy, and suitability for real-world use cases.



Serving Deep Learning Models with Rust (and comparing it to Python)

Building a service to serve machine learning models, such as a BERT-based embedding generator, requires careful consideration of factors like performance, ease of development, and maintainability. This article explores two implementations of such a service—one in Rust and the other in Python—and highlights their design choices, strengths, and trade-offs.

To explore the efficiency of serving machine learning models, I conducted a benchmark comparison between Python and Rust implementations. My initial hypothesis was that Rust, with its reputation for high performance and low-level control, would significantly outperform Python.

Additionally, I aimed to investigate how different concurrency mechanisms, such as RwLock and Mutex, and the choice between sharing or not sharing the model state among workers would influence performance.



The Actor Model in Rust

In this post we tinker with the actor model in Rust, an interesting exercise given the language’s strengths in memory safety and concurrency, which make it a good fit for building robust concurrent systems. We’ll explore a program that implements an actor model on top of the asynchronous runtime Tokio. The example illustrates message passing, state management, and graceful shutdown in a concurrent environment.



Implementing a Singleton Pattern in Rust: A Practical Example

In this blog post, we’ll explore how to implement a Singleton pattern in Rust. The Singleton design pattern ensures a class has only one instance while providing a global access point to that instance. Rust’s ownership model and thread safety make this implementation an interesting challenge.
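As a rough illustration of the idea (not necessarily the approach the full post takes), a thread-safe singleton can be sketched with the standard library's `OnceLock`, which needs Rust 1.70 or later. The `Config` type, its `hits` field, and the `config()` accessor are hypothetical names chosen for this sketch.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical shared state; the name and field are illustrative only.
#[derive(Debug, Default)]
struct Config {
    hits: u32,
}

/// Global access point: returns the single process-wide instance,
/// creating it lazily on the first call.
fn config() -> &'static Mutex<Config> {
    // OnceLock guarantees the initializer runs at most once, even if
    // several threads race to call `config()` for the first time.
    static INSTANCE: OnceLock<Mutex<Config>> = OnceLock::new();
    INSTANCE.get_or_init(|| Mutex::new(Config::default()))
}

fn main() {
    // Each statement locks, mutates, and releases the same instance.
    config().lock().unwrap().hits += 1;
    config().lock().unwrap().hits += 1;
    println!("hits = {}", config().lock().unwrap().hits);
}
```

The `Mutex` is there because a `static` can be reached from any thread; for read-only state the `OnceLock` alone would be enough.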



Asynchronous Programming and the Actor Model in Golang

This blog post provides insights into Golang’s concurrent programming features. It delves into the implementation of actors, independent entities that communicate through messages. We walk through a simple actor system implementation in Go, which showcases actor creation, message sending, and concurrent message processing, highlighting the principles of the actor model.



A Keras-like Deep Learning Crate in Rust Powered by Candle

While candle provides a low-level API for building neural network models, some users (including myself) may prefer a more intuitive and user-friendly way to define, compile, and train their models. That’s why I propose the addition of a high-level Keras-like API to candle. This API would allow users to define models in a sequential manner by adding layers one after the other. It would also provide methods for model compilation, training, and evaluation.



Mojo - An Introduction

A first look at Mojo. In this post I scratch the surface of Mojo’s syntax and compare its borrow semantics to Rust’s.



Using Actix to expose a command line tool

In this blog post, we’ll explore a Rust program that exposes an existing command-line tool via a REST API. This program leverages the Actix-web framework to create a simple HTTP server, handles HTTP requests, and interacts with an external CLI tool.



Understanding Ord, Eq, PartialOrd, and PartialEq Traits in Rust

Rust provides several traits that are fundamental for comparing and ordering values. These traits include Ord, Eq, PartialOrd, and PartialEq. In this blog post, we’ll explore what these traits are, when they can and can’t be derived, and how they can be used to sort a vector.
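To make the derive question concrete, here is a small sketch using only the standard library. The `Version` and `Price` types and the two helper functions are hypothetical examples invented for illustration; they are not taken from the full post.

```rust
/// All four traits can be derived because every field implements them.
/// The derived ordering compares fields in declaration order:
/// `major` first, then `minor`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u32,
    minor: u32,
}

/// `f64` is only `PartialEq`/`PartialOrd` (NaN is not equal to itself),
/// so `Eq` and `Ord` cannot be derived for this wrapper.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
struct Price(f64);

/// `Vec::sort` requires `Ord`, which `Version` has.
fn sorted_versions(mut versions: Vec<Version>) -> Vec<Version> {
    versions.sort();
    versions
}

/// Without `Ord`, `sort()` does not compile; supply a comparator instead.
/// `f64::total_cmp` provides a total order over floats.
fn sorted_prices(mut prices: Vec<Price>) -> Vec<Price> {
    prices.sort_by(|a, b| a.0.total_cmp(&b.0));
    prices
}

fn main() {
    let versions = sorted_versions(vec![
        Version { major: 1, minor: 2 },
        Version { major: 0, minor: 9 },
        Version { major: 1, minor: 0 },
    ]);
    println!("{:?}", versions);

    let prices = sorted_prices(vec![Price(3.5), Price(1.25), Price(2.0)]);
    println!("{:?}", prices);
}
```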



How to process and download financial data in Rust

This is the first blog post of a series that will focus on time series analysis and forecasting of stock prices. This post shows how to get data from Yahoo Finance and parse JSON responses in Rust.



Rust - Dynamic dispatching and Generics

In this blog post I show the main differences between dynamic dispatch (dyn) and generics (impl / <T>), using a simple program that creates different database connectors.

NB: this blog post assumes some familiarity with Rust.
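A minimal sketch of the contrast, assuming a hypothetical `Connector` trait with two made-up implementations (`Postgres` and `Sqlite`); the full post's connectors may look quite different.

```rust
/// Hypothetical connector trait; all names here are illustrative.
trait Connector {
    fn connect(&self) -> String;
}

struct Postgres;
struct Sqlite;

impl Connector for Postgres {
    fn connect(&self) -> String {
        "connected to postgres".to_string()
    }
}

impl Connector for Sqlite {
    fn connect(&self) -> String {
        "connected to sqlite".to_string()
    }
}

/// Generics (`<T>`): static dispatch, one compiled copy per concrete type.
fn connect_static<C: Connector>(c: &C) -> String {
    c.connect()
}

/// `impl Trait` in argument position: shorthand for the generic form above.
fn connect_impl(c: &impl Connector) -> String {
    c.connect()
}

/// `dyn`: dynamic dispatch, the method is looked up at runtime via a vtable.
fn connect_dyn(c: &dyn Connector) -> String {
    c.connect()
}

/// Choosing a connector at runtime needs a trait object, because the
/// branches return different concrete types.
fn from_name(name: &str) -> Option<Box<dyn Connector>> {
    match name {
        "postgres" => Some(Box::new(Postgres)),
        "sqlite" => Some(Box::new(Sqlite)),
        _ => None,
    }
}

fn main() {
    println!("{}", connect_static(&Postgres));
    println!("{}", connect_impl(&Sqlite));
    if let Some(c) = from_name("sqlite") {
        println!("{}", connect_dyn(c.as_ref()));
    }
}
```

Generics tend to suit cases where the concrete type is known at compile time; trait objects suit heterogeneous collections or choices made at runtime, at the cost of an indirect call.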

