Python on Benedykt Huszcza | Blog

Engineering Thesis – Project Developed with OLX

Mon, 24 Feb 2025 02:00:00 +0000

Our Engineering Project – Is This the Future of Marketing?

This project was developed by a four-person team as our engineering thesis. It was a huge challenge for us, but at the same time, an incredible adventure that allowed us to combine creativity with modern technology.

As I write this post, I feel proud of what we’ve accomplished. On the other hand, there are some limitations, which I’ll explain in a moment. But first, let’s talk about what we managed to create and why we believe it could be one of the future directions for marketing.

What is this engineering project about?

In short – together with OLX, we set out to create a tool that allows companies to quickly and easily generate advertising banners without the need to hire a team of graphic designers. This is made possible by using artificial intelligence algorithms that automate the design process.

Just think about it – how many times have you seen repetitive, boring ads that passed by unnoticed? Or how many companies give up on advertising campaigns due to the cost of hiring professional designers? These are exactly the problems we aimed to solve by creating a banner generator platform.

How does it work?

Without getting too technical – the system uses generative AI solutions to create unique and visually appealing banners based on user preferences. The user can choose from three different types of banners, add their logo, set dominant colors, and even generate a catchy slogan. The whole process takes just a few minutes!

Why can’t I reveal everything… yet

And now we get to the most interesting (and somewhat frustrating) part. At this stage, I can’t go into technical details or show exactly how the system works. Why? Simply because the copyright situation is still unclear. The project has sparked interest and… that’s all I can say for now, as I don’t want to jinx anything.

What’s next?

This post isn’t overly detailed, but I plan to expand it in the future with technical insights. I’d like to explain how we managed to integrate AI with a seemingly creative process like graphic design. For now, I need to be patient and wait to see how things develop regarding the potential acquisition of the project.

One thing is certain – this engineering project was a challenge but also an opportunity for growth and learning new tools. Regardless of what the future holds, I am already proud of what we have achieved. And if everything goes as planned… who knows, maybe one day you’ll see the results of our work on OLX banners?

Vessel Extraction – Image Processing Using Python and OpenCV

Sun, 23 Feb 2025 18:00:00 +0000

Introduction

This project was developed as part of the Medical Informatics course. Apart from a hackathon (by the way, I highly recommend reading Maciej’s post about this event), this was my first serious encounter with libraries such as PyTorch and OpenCV.

Since the task turned out to be quite challenging, it forced me to dive deep into research on various image processing methods. I explored literally everything – from the simplest filters to more advanced computer vision techniques. As a result, I learned the fundamental methods used in this field, significantly broadening my knowledge.

I won’t lie – it was tough at times, especially when noise in the images ruined hours of coding. Nevertheless, the vision of using technology to analyze medical images inspired me and kept me going. There were moments of frustration when things didn’t work as expected, but the satisfaction of a working solution definitely made up for all the struggles.

K-Nearest Neighbors (KNN) – Blood Vessel Classification

What is K-Nearest Neighbors?

K-Nearest Neighbors (KNN) is one of the simplest and most intuitive machine learning algorithms. It operates on the assumption that similar data points are close to each other in feature space. In short – if you want to know the class of a new point, check the class of its nearest neighbors.

In the case of Vessel Extraction, KNN was used for:

Classifying pixels as “blood vessels” or “background,”
Analyzing pixel neighborhoods to better distinguish vessels from noise.

Undersampling – How Did I Deal with Imbalanced Data?

What Were the Challenges?

Overwhelming amount of background data – Areas without blood vessels (background) dominated the images, causing the model to learn to recognize mainly the background and not the vessels.
Underrepresentation of blood vessels – Pixels belonging to blood vessels made up less than 10% of all data, which biased the model toward predicting the majority (background) class.

How Did I Handle It?

I decided to use undersampling – intentionally reducing the number of background samples so that the number of vessel and background pixels was more balanced. Sounds simple, but it required a few thoughtful steps:

Selecting Background Samples:
- I didn’t randomly discard background data, as this could lead to a loss of important contextual information.
- I focused on representative samples, specifically those located near blood vessels. This gave the model better learning context.
Reducing Background Samples:
- I ultimately reduced the number of background samples by about 70%, resulting in a more balanced ratio of vessel to background data.
- It was crucial not to overdo it – I had to leave enough background to prevent the model from confusing it with vessels.
Preserving Local Patterns:
- By using 3x3 pixel patches, the model retained local patterns, which improved accuracy.
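
The selection strategy above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's code: the function name `undersample_background`, the `keep_fraction` of 0.3 (chosen to mirror the roughly 70% reduction), and the `radius` that defines "near a vessel" are all hypothetical.

```python
import random


def undersample_background(mask, keep_fraction=0.3, radius=1, seed=0):
    """Return (vessel_pixels, kept_background_pixels) as lists of (row, col).

    mask is a 2D list of 0/1 values where 1 marks a vessel pixel.
    Background pixels within `radius` of a vessel are kept first; any
    remaining quota is filled with randomly chosen far-background pixels.
    """
    height, width = len(mask), len(mask[0])
    vessels, near, far = [], [], []

    for r in range(height):
        for c in range(width):
            if mask[r][c] == 1:
                vessels.append((r, c))
                continue
            # Does any vessel pixel lie inside the window around (r, c)?
            touches_vessel = any(
                mask[rr][cc] == 1
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            )
            (near if touches_vessel else far).append((r, c))

    # Keep only a fraction of all background pixels (assumed value).
    quota = round(keep_fraction * (len(near) + len(far)))
    rng = random.Random(seed)
    rng.shuffle(near)
    rng.shuffle(far)
    kept = (near + far)[:quota]  # near-vessel background has priority
    return vessels, kept
```

On a real 512x512 image a vectorized NumPy or OpenCV version would be much faster; the nested loops are kept here only for readability.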

How Did KNN Work in This Project?

Feature Extraction:
- Each pixel was described by its brightness value and the values of neighboring pixels.
- This provided the model with more information about the local context.
Choosing the Number of Neighbors (k):
- The key parameter in KNN is k – the number of nearest neighbors whose class is considered for classification.
- I conducted cross-validation to find the optimal value for k.
- The best results were achieved with k = 5, ensuring a balance between accuracy and recall.
Classification:
- For each pixel, the classes of its k nearest neighbors were checked.
- The pixel was assigned to the class with the most representatives in its neighborhood.
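
The three steps above can be sketched as follows. This is a minimal plain-Python illustration, assuming Euclidean distance, zero padding at the image border, and a 3x3 neighborhood as the feature vector; the helper names `neighborhood_features` and `knn_predict` are hypothetical, and a real project would more likely rely on a library such as scikit-learn.

```python
import math
from collections import Counter


def neighborhood_features(image, row, col):
    """Flatten the 3x3 window around (row, col); pixels outside the image count as 0."""
    height, width = len(image), len(image[0])
    features = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < height and 0 <= c < width
            features.append(image[r][c] if inside else 0)
    return features


def knn_predict(train_features, train_labels, query, k=5):
    """Majority vote among the k training points closest to `query`."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

Calling `knn_predict(features, labels, neighborhood_features(image, r, c), k=5)` for every pixel yields a per-pixel vessel/background decision. An odd k avoids tied votes in a two-class problem.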

Results and Performance

Accuracy: 89% – pretty good for a simple model without deep learning!
Recall: 85% – effectively detected blood vessels but sometimes confused them with thin background lines.
Precision: 91% – the model successfully avoided false positives (mistaking the background for vessels).
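
For reference, all of these metrics follow from the four confusion-matrix counts. The sketch below assumes per-pixel binary labels (1 for vessel, 0 for background); the function name `binary_metrics` is hypothetical, and the Dice score is included because it is a common overlap measure for segmentation masks.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and Dice for 0/1 label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # vessel found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # background kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # background marked as vessel
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # vessel missed

    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "dice": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }
```

With heavily imbalanced data, accuracy alone can look good even when many vessels are missed, which is why precision and recall are reported next to it.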

I know, at first glance the results may not look impressive, and at the same time it may be hard to believe that such a simple model produced such good “numbers” (i.e., accuracy). But here’s the trick – it’s all about the chosen approach.

I used 3x3 pixel patches because smaller fragments make it easier for the model to detect local patterns characteristic of blood vessels. The total image size was 512x512 pixels, so if the classifier recognized a 3x3 patch as a vessel, all 9 pixels in that patch were completely filled in white.
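
The patch-filling step can be illustrated with a short sketch. It assumes non-overlapping patches laid out on a regular grid and leaves any leftover border pixels black (512 is not a multiple of 3, and how the remainder was handled is not stated here); the function name `paint_patch_predictions` is hypothetical.

```python
def paint_patch_predictions(patch_labels, height, width, patch_size=3, white=255):
    """Build an output mask where every patch classified as vessel is filled white.

    patch_labels is a 2D list: patch_labels[i][j] is truthy when the patch
    in grid row i and grid column j was classified as a vessel.
    """
    mask = [[0] * width for _ in range(height)]
    for pr, row_labels in enumerate(patch_labels):
        for pc, is_vessel in enumerate(row_labels):
            if not is_vessel:
                continue
            # Fill all pixels of this patch, clipped to the image bounds.
            for r in range(pr * patch_size, min((pr + 1) * patch_size, height)):
                for c in range(pc * patch_size, min((pc + 1) * patch_size, width)):
                    mask[r][c] = white
    return mask
```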

This approach meant the model was more confident in its decisions, which positively impacted accuracy and Dice score.

FastAi – Deep Learning for Blood Vessel Classification

Why FastAi?

After testing the classic KNN approach, I decided to take it up a notch and use FastAi – a framework built on PyTorch that is excellent for rapid prototyping of deep learning models. FastAi provides:

Easy integration with pre-trained models (e.g., ResNet),
A simple API that speeds up data preparation and model training,
Advanced optimization techniques (e.g., learning rate finder).

How Did FastAi Work in This Project?

Data Preparation:
- Images were divided into smaller patches to help the models learn patterns more effectively.
- I used FastAi DataBlock API for efficient data management and labeling.
- Each patch was classified into one of two classes:
  - Blood vessels,
  - Background.
Deep Learning Model:
- I chose ResNet34 – lightweight but powerful enough for vessel recognition.
- I used transfer learning with pre-trained weights (ImageNet), which sped up training.
- Fine-tuning the last layers helped tailor the model to the specific task of vessel recognition.
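
A setup along these lines might look like the sketch below. It is not the project's code: the folder layout (one subfolder per class, for example `patches/vessel` and `patches/background`), the `Resize(224)` transform, the batch size, and the function name `build_learner` are all assumptions. It also assumes a recent fastai release, where the factory is called `vision_learner` (older releases name it `cnn_learner`).

```python
def build_learner(patch_dir, valid_pct=0.2, seed=42):
    """Create a ResNet34 transfer-learning classifier for image patches.

    Assumes patch_dir contains one subfolder per class, so the parent
    folder name can serve as the label.
    """
    # Imported lazily so this module can be loaded without fastai installed.
    from fastai.vision.all import (
        CategoryBlock, DataBlock, ImageBlock, RandomSplitter, Resize,
        accuracy, get_image_files, parent_label, resnet34, vision_learner,
    )

    patches = DataBlock(
        blocks=(ImageBlock, CategoryBlock),       # image in, class label out
        get_items=get_image_files,                # collect all image files
        get_y=parent_label,                       # label = parent folder name
        splitter=RandomSplitter(valid_pct=valid_pct, seed=seed),
        item_tfms=Resize(224),                    # assumed input size
    )
    dls = patches.dataloaders(patch_dir, bs=64)
    # ImageNet-pretrained weights are loaded by default.
    return vision_learner(dls, resnet34, metrics=accuracy)


# Typical usage (requires fastai and a prepared patch directory):
#   learn = build_learner("patches")
#   learn.lr_find()      # learning rate finder
#   learn.fine_tune(3)   # train the new head first, then unfreeze the rest
```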

Results and Performance

Accuracy: 92% – a clear improvement compared to KNN.
Recall: 90% – the model effectively recognized vessels, even in challenging cases.
Precision: 94% – very few false positives, resulting in highly accurate vessel detection.

First Step into Machine Learning

This was my first individual project in machine learning and computer vision, and it was an incredible learning experience. I understood how powerful image processing techniques are and how to handle imbalanced data with undersampling. Although there were many challenges and frustrations, I got hooked on machine learning. Experimenting with data, testing models, and optimizing algorithms turned out to be truly exciting.

I realize that I have a lot more to learn – from advanced neural network architectures to GPU optimization – but I’m excited to continue this journey. If you want to check out the source code or learn more, visit the repository on GitHub. Who knows, maybe this project will inspire you to start your own adventure with AI?

SpeedDatingMatcher – Event Management with Django and Next.js

Sun, 23 Feb 2025 17:00:00 +0000

Where Did the Idea Come From?

I am lucky to have my sister Róża, who studies at the Medical University of Białystok. Not only does she study there, but she also runs the university magazine – Młody Medyk. For about two years, the student organization under Róża’s leadership has been organizing speed dating events. This is exactly where the problem arose that the future Doctor brought to me. As a mature developer, I decided to roll up my sleeves and solve it.

What is SpeedDatingMatcher?

SpeedDatingMatcher is an event management system designed for registering participants, recording their preferences (whom they are willing to contact after the event), and sending emails with contact information to the selected people. The application handles email communication through integration with Brevo. The first edition was deployed on Microsoft Azure. Honestly, I wasn’t fully aware of all the properties of SSR at the time, and although I eventually managed to deploy the system there, the whole endeavor ended with considerable frustration with Azure.

In this year’s edition of the application (2025), I opted for a more civilized approach: a VPS with Docker and Nginx. I also had access to a Free Tier AWS EC2 server, but since I was planning to create this blog in the near future, I decided to purchase server access right away.

Features

The main task of the application was not only sending emails but also automatically matching the people participating in the event. That is: if person number 3 wanted to contact numbers 2 and 1, then before sending any email I needed to check whether person 2 and person 1 also wanted to contact number 3. An email is sent only for a pair whose preference is mutual. This situation is illustrated by the following graphic:

In graph terms, an email is sent only when the directed preference graph contains a cycle of length 2, that is, two participants who point at each other.
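
The mutual-preference check can be sketched in a few lines of Python. This is an illustration only: the function name `mutual_matches` and the dictionary-of-sets input format are assumptions, not the application's actual data model.

```python
def mutual_matches(preferences):
    """Return sorted pairs (a, b) where a wants to contact b and b wants to contact a.

    preferences maps each participant number to the set of participant
    numbers they are willing to contact after the event.
    """
    matches = set()
    for person, wanted in preferences.items():
        for other in wanted:
            # A pair counts only if the preference goes both ways.
            if other != person and person in preferences.get(other, set()):
                matches.add(tuple(sorted((person, other))))
    return sorted(matches)


# Person 3 wants 1 and 2, person 2 wants 3, person 1 wants 2.
example = {1: {2}, 2: {3}, 3: {1, 2}}
print(mutual_matches(example))  # [(2, 3)]
```

Only participants 2 and 3 chose each other, so only that pair would receive an email with contact details.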

Ultimately, the application enables:

Managing speed dating events – Comprehensive management of participants and meetings.
Email validation – Ensuring that all provided email addresses are correct.
Integration with Brevo – Automatic sending of notifications and reminders to participants.

Additionally, thanks to Django, it’s easy to create and manage users.

Tech Stack

Django – Backend and database management.
Next.js – Fast frontend with server-side rendering capabilities.
Brevo – Integration for sending emails.
BeautifulSoup4 – Dynamic modification of HTML email templates.
Microsoft Azure – Cloud hosting for the first version.
Docker – Containerization of solutions.
Nginx – Reverse proxy on the VPS server.

Summary

Honestly, this was my first serious project realized outside of work. I would be lying if I said that planning the architecture, selecting technologies, and writing the solution didn’t give me immense joy and didn’t awaken my developer’s soul.

Thanks to this project, I had the opportunity to test my ideas and also get familiar with SSR, Django, and Azure.

I co-created the project with Maciek, which gave us an opportunity to plan and divide tasks and to review each other’s code.

Want to learn more? Check out the repository on GitHub.