Week #1 — The Dataset
Sachin C -
This week, I began gathering data to train the artificial intelligence model I intend to run on the microcontrollers. A model's quality scales with the data it is trained on, so assembling a representative sample of training images is essential for an unbiased experiment.
- Curating Existing Datasets: I am sourcing labeled image datasets from platforms like Kaggle, ImageNet, and OpenAI’s datasets.
- These sources provide high-quality training images that are already labeled and tagged, which saves me considerable annotation effort.
- Collecting Custom Data: Using tools like OpenCV for image processing and Python scripts to automate data collection, I am capturing real-world images that closely resemble the use case of my model.
- Using these scripts ensures that my data is relatively standardized.
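As a rough illustration of what such a standardization script might look like, here is a minimal sketch in plain NumPy (the function name, target resolution, and processing steps are my own assumptions, not the author's actual script):

```python
import numpy as np

TARGET_SIZE = (96, 96)  # assumed target resolution for a microcontroller-sized model

def standardize(image: np.ndarray, size=TARGET_SIZE) -> np.ndarray:
    """Convert an RGB image to grayscale, resize it (nearest neighbor),
    and normalize pixel values to [0, 1]."""
    # Grayscale via the standard luminance weights
    gray = image[..., :3] @ np.array([0.299, 0.587, 0.114])
    # Nearest-neighbor resize: sample evenly spaced source rows/columns
    rows = np.linspace(0, gray.shape[0] - 1, size[0]).astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, size[1]).astype(int)
    resized = gray[np.ix_(rows, cols)]
    # Normalize 0-255 pixel values to floats in [0, 1]
    return (resized / 255.0).astype(np.float32)

# Example: a fake 240x320 RGB "capture" standing in for a real camera frame
frame = np.random.randint(0, 256, size=(240, 320, 3), dtype=np.uint8)
out = standardize(frame)
print(out.shape, out.dtype)  # (96, 96) float32
```

In a real pipeline the fake frame would be replaced by images read with OpenCV (e.g. `cv2.imread`), but the standardization logic stays the same.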
- Data Preprocessing: I am using libraries like Pandas and NumPy to clean, filter, and structure the dataset.
- Structuring the dataset is important so that I can minimize noise: irrelevant or random data points that obscure underlying patterns and add no useful information.
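To illustrate the kind of cleaning and filtering this step involves, here is a minimal Pandas sketch; the column names, sample rows, and filtering rules are assumptions made up for the example:

```python
import pandas as pd

# Hypothetical labels table: one row per image, with a class label
df = pd.DataFrame({
    "filename": ["a.jpg", "b.jpg", "b.jpg", "c.jpg", "d.jpg"],
    "label":    ["cat",   "dog",   "dog",   None,    "cat"],
})

# Drop duplicate image entries and rows with a missing label
clean = df.drop_duplicates(subset="filename").dropna(subset=["label"])

# Keep only the classes the model will actually be trained on
allowed = {"cat", "dog"}
clean = clean[clean["label"].isin(allowed)].reset_index(drop=True)

print(len(clean))                               # 3 rows remain
print(clean["label"].value_counts().to_dict())  # {'cat': 2, 'dog': 1}
```

Dropping duplicates and unlabeled rows up front is one simple way to reduce the noise described above before any training happens.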
- Augmenting Data: Since deep learning models perform better with diverse datasets, I am applying image augmentation techniques using TensorFlow and OpenCV.
- These techniques ensure that the model can recognize images with slight variations (e.g., a darker background, or an image rotated 90 degrees).
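The augmentations described above can be sketched in plain NumPy (the author uses TensorFlow and OpenCV; these NumPy versions are equivalent in spirit, and the brightness factor is my own assumed value):

```python
import numpy as np

def rotate_90(image: np.ndarray) -> np.ndarray:
    """Rotate an image 90 degrees counterclockwise."""
    return np.rot90(image)

def darken(image: np.ndarray, factor: float = 0.5) -> np.ndarray:
    """Simulate a darker scene by scaling pixel intensities down."""
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def flip_horizontal(image: np.ndarray) -> np.ndarray:
    """Mirror the image left to right."""
    return image[:, ::-1]

# Apply each augmentation to a random test image
img = np.random.randint(0, 256, size=(96, 96, 3), dtype=np.uint8)
augmented = [rotate_90(img), darken(img), flip_horizontal(img)]
print([a.shape for a in augmented])  # all (96, 96, 3)
```

Each augmented copy is added to the training set alongside the original, so the model sees the same subject under several plausible variations.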
The goal is to compile a well-structured dataset that balances efficiency and accuracy while being lightweight enough to function on a microcontroller. Over the next few weeks, I’ll continue refining my dataset before moving on to model training and optimization.
I am very excited to check back in next week to update you all on my progress!