The demand for applications deploying deep learning models in resource-constrained environments is on the rise today, driven by the need for low latency and real-time inference. However, many deep learning models are too large and complex to perform effectively on such devices, posing challenges for scaling and model deployment.

Therefore, striking a balance between maintaining high accuracy and reducing inference time and model size becomes essential. In the study presented in this white paper, three different models (Custom, VGG16[2], and MobileNet[3]) are compressed using tiny machine learning (TinyML), a framework for model optimization and compression. The primary goal is to preserve optimal accuracy while significantly reducing inference time and model size.

The study assesses the trade-offs between accuracy, size reduction, and inference time by comparing the compressed models with the original models. Additionally, the study intends to explore TinyML's potential to enhance user experience and enable edge computing in medical applications.



In recent years, the deployment of deep learning models on resource-constrained edge devices, such as smartphones, wearable devices, IoT devices, edge servers, and embedded systems, has increased exponentially, posing challenges due to their limited computational power and memory space.

The study presented here aims to employ TinyML (tiny machine learning) techniques to compress Custom, VGG16, and MobileNet models across datasets taken from the fashion, radiology, and dermatology fields. It prioritizes three features: optimal accuracy, reduced inference time, and a model size small enough for deployment on resource-constrained edge devices.

The main compression techniques applied here are quantization and pruning. Quantization reduces the numerical precision of the model's weights and activations, while pruning selectively eliminates redundant or unnecessary connections within the neural network architecture without significantly degrading the model. Together, these techniques reduce computational cost and memory usage for quick, efficient deployment, tuned to the requirements of TinyML.
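These two ideas can be illustrated with a framework-independent sketch. The functions and values below are hypothetical stand-ins for what a toolchain such as TensorFlow Lite performs internally, not its actual API:

```python
# Minimal sketch, assuming per-tensor affine quantization and
# magnitude-based pruning -- illustrative only, not a framework API.

def quantize_int8(weights):
    """Map float weights onto the signed INT8 range [-127, 127]."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0                      # one float kept per tensor
    q = [round(w / scale) for w in weights]      # low-precision integers
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values for inference."""
    return [v * scale for v in q]

def prune_by_magnitude(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of the weights."""
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k] if k else 0.0
    return [0.0 if abs(w) < threshold else w for w in weights]

weights = [0.02, -0.75, 0.40, -0.01, 0.95, -0.33]
q, scale = quantize_int8(weights)      # 8-bit integers plus one scale
restored = dequantize(q, scale)        # close to the original floats
pruned = prune_by_magnitude(weights)   # half the weights become zero
```

Storing 8-bit integers plus a single scale factor (instead of 32-bit floats) is what yields the roughly 4x size reduction discussed later, and zeroed weights can be skipped or stored sparsely.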

For our study, which involved methodical testing of different permutations and combinations on the datasets, our focus was to tune the models and parameters to increase accuracy while reducing inference time and model size. Further, this endeavor leverages quantization techniques from TensorFlow Lite to investigate how TinyML might facilitate edge computing and improve user experience. TinyML makes it easier to implement effective, lightweight models, allowing real-time inference on edge devices with limited resources.


What is TinyML?

Tiny machine learning (TinyML) is a decentralized technique that allows us to deploy machine learning models and algorithms on extremely low-power, small-footprint devices such as microcontrollers and IoT devices. With TinyML, microcontroller-based embedded devices can respond in real time to machine-learning tasks.

Its applications span multiple disciplines of technology such as hardware, algorithms, and software capable of processing on-device sensor data.

The most widely adopted ecosystem for TinyML development is TensorFlow Lite. It provides a Python-based environment with an extensive collection of libraries and toolkits that help developers implement machine learning models effectively.

The general process followed for deployment on edge devices is shown below:




Applications of TinyML

  • Personalized Healthcare
  • Industry IoT
  • Smart Homes
  • Autonomous Driving

II - Problem Definition:

Model Compression for Efficient Deployment

We focused on the following elements for this study:

  • Implementation of TinyML techniques to compress Custom, VGG16, and MobileNet models on three different datasets across various industry disciplines (fashion, radiology, and dermatology).
  • Prioritizing a low footprint for deployment while ensuring the best possible accuracy, reduced inference time, and minimized model size.

Solution Details

The principal objective was to create an accelerator methodology for deep learning model compression tailored for TinyML edge-device deployment. Initially, we trained SSD-MobileNet and EfficientDet models on a custom dataset using the TensorFlow 2 Object Detection API. These models were then converted to the TensorFlow Lite format using TinyML compression techniques for integration with edge devices, while preserving the three primary metrics: good accuracy, low inference time, and reduced model size.

Overall, our solution offers a comprehensive and effective approach to deploying compressed deep learning models optimized for TinyML on edge devices, addressing the growing demand for efficient and lightweight machine learning solutions in various industry sectors.

Business Benefits/Best Practices

The global edge computing market was valued at $16.45 billion in 2023 and is expected to grow at a CAGR of 37.9% from 2023 to 2030; optimizing deep learning models for edge devices therefore emerges as a pivotal strategy [1].

In the rapidly changing field of healthcare edge computing, real-time patient data analysis that enables remote diagnostic and treatment decisions without continuous Wi-Fi connectivity is imperative. The model needs to be scalable, customizable, and tailored to the needs of the healthcare industry. For example, wearable health monitoring systems with low-power microcontrollers can accurately detect irregularities in vital signs, promptly alerting medical personnel and improving patient outcomes.

The TinyML approach gives businesses a competitive advantage by delivering substantial cost savings and performance enhancements, increasing customer satisfaction. These systems can be deployed in environments without Wi-Fi networks and in remote or inaccessible locations, seamlessly executing required tasks.

III - Model Pre-processing and Training:

The principal pre-processing step was image resizing: each input image is scaled to the dimensions expected by the model before training or inference.
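The image-resize step can be sketched as a minimal nearest-neighbour implementation in plain Python. Production pipelines would use a library routine (e.g., in TensorFlow or PIL), so this is only an illustration of the index arithmetic involved:

```python
# Hedged sketch of nearest-neighbour resizing for a 2-D grayscale image,
# represented as a list of rows of pixel values.

def resize_nearest(image, out_h, out_w):
    """Resize `image` to (out_h, out_w) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

# e.g. scale a 28x28 FashionMNIST-style image up to 224x224
small = [[(r + c) % 256 for c in range(28)] for r in range(28)]
large = resize_nearest(small, 224, 224)
```

The same function also downsamples, so the 150x150, 1000x1000, and 224x224 targets used later are all variations of this one operation.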


V - Data Sets

Data Set Optimization Domains: Fashion, Radiology, and Dermatology

Optimization algorithms were implemented using datasets from a variety of fields such as fashion, radiology, and dermatology.

Utilizing diverse datasets provides important insights into the effectiveness and versatility of optimization methods. Each dataset has different challenges, traits, and subtleties that call for investigating a large range of optimization strategies. Through the integration of diverse datasets, we can acquire a thorough understanding of the performance of the different optimization approaches.

To optimize the models, we first ran a script over each dataset to gain familiarity with its characteristics. Using TensorFlow Lite, we then applied quantization and pruning to meet our primary imperatives: good accuracy, low inference time, and reduced model size.
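A familiarization script of this kind can be sketched as follows. The in-memory dataset layout (a list of (image, label) pairs) and the toy data are assumptions for illustration, not the actual script used in the study:

```python
# Hypothetical dataset-familiarization sketch: report basic
# characteristics (sample count, image shape, class balance).
from collections import Counter

def summarize(samples, class_names):
    """Summarize a dataset given as a list of (image, label) pairs."""
    labels = Counter(label for _, label in samples)
    first_image, _ = samples[0]
    shape = (len(first_image), len(first_image[0]))
    return {
        "num_samples": len(samples),
        "image_shape": shape,
        "class_counts": {class_names[k]: v for k, v in sorted(labels.items())},
    }

# toy stand-in: four tiny 28x28 "images" with binary labels
data = [([[0] * 28] * 28, 0), ([[1] * 28] * 28, 1),
        ([[2] * 28] * 28, 1), ([[3] * 28] * 28, 0)]
report = summarize(data, ["Normal", "Pneumonia"])
```

A class-balance report like this is what flags, for example, the roughly 3:1 pneumonia-to-normal imbalance in the radiology training split described below.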

A. Fashion: FashionMNIST optimization

For the fashion domain, we optimized the models trained on the FashionMNIST dataset.

Fashion datasets contain images of clothing items, accessories, and fashion products.

Image 1: FashionMNIST Dataset Specifications

Dataset                 70,000 grayscale images in 10 categories; each image shows an individual article of clothing at low resolution (28 x 28 pixels)
Train                   60,000 images
Test                    10,000 images
Classes                 0-9
Class names (labels)    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'


B. Radiology: Pneumonia dataset optimization

Radiology datasets comprise medical images such as chest X-rays, CT scans, and radiology function test results. For example, machine learning models trained on radiology datasets can analyze chest X-ray images to detect abnormalities such as pneumonia, tuberculosis, or lung cancer, assisting radiologists in accurate diagnosis and treatment planning.

For this dataset, we first organized and prepared a collection of chest X-ray images from two types of patients: normal and with pneumonia. We then resized the images to different dimensions (150x150, 1000x1000, and 224x224) to assess how well the models adapt to varying input sizes. Finally, we trained models using both custom and pre-trained approaches, noting how the different model configurations performed.

Radiology Dataset Specifications

Dataset                 Chest X-ray images
Train                   5,216 images (Normal: 1,341; Pneumonia: 3,875)
Test                    624 images (Normal: 234; Pneumonia: 390)
Validation              16 images (Normal: 8; Pneumonia: 8)
Classes                 0-1
Class names (labels)    'Normal', 'Pneumonia'


C. Dermatology: Skin lesion dataset optimization

Dermatology datasets typically consist of images of skin conditions, lesions, and diseases.

We began by sampling a skin lesion dataset, which included multi-source dermatoscopic images of pigmented lesions. We organized and prepared the dataset through pre-processing to ensure readiness for analysis. Like other datasets, we resized the images to 224x224, demonstrating the adaptability of the models to specific image dimensions. For training, we utilized pre-trained models such as MobileNetV2, and various configurations were assessed to gauge performance metrics.

Skin Lesion Dataset Specifications

Dataset                 Dermatoscopic images
Train                   6,585 images
Test                    840 images
Classes                 0-6
Class names (labels)    'Actinic Keratosis (AKIEC)', 'Basal Cell Carcinoma (BCC)', 'Benign Keratosis (BKL)', 'Dermatofibroma (DF)', 'Melanocytic Nevus (NV)', 'Melanoma (MEL)', 'Vascular Lesion (VASC)'


Models Utilized Across Diverse Domains:

Domain         Dataset                 Models Used
Fashion        FashionMNIST dataset    Custom, VGG16, MobileNet
Radiology      Pneumonia dataset       Custom, VGG16, MobileNet
Dermatology    Skin lesion dataset     Custom, VGG16, MobileNet

VI - Results:




Our study shows that TinyML is a powerful framework for deploying deep learning models on edge devices. It offers a balance between efficiency and accuracy, making it ideal for diverse applications in computer vision, natural language processing, and beyond.

1. Accuracy vs Inference time vs Model size trade-off:

  • The original TensorFlow models consistently outperform their TensorFlow Lite optimized counterparts in accuracy.
  • VGG16 shows the highest accuracy among the models optimized by TensorFlow Lite.

2. Inference time assessment:

  • While some TensorFlow Lite optimizations increase inference time, it remains generally acceptable.
  • Specific optimization techniques on certain models lead to prolonged inference times.
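Inference-time comparisons of the kind assessed above typically average wall-clock timings over repeated runs, discarding warm-up iterations. The toy model below is a hypothetical stand-in for a TensorFlow Lite interpreter invocation:

```python
# Sketch of a latency benchmark; any callable model can be measured.
import time

def measure_latency(infer_fn, inputs, warmup=3, runs=20):
    """Return the average wall-clock inference time in milliseconds."""
    for _ in range(warmup):          # warm-up runs excluded from timing
        infer_fn(inputs)
    start = time.perf_counter()
    for _ in range(runs):
        infer_fn(inputs)
    return (time.perf_counter() - start) / runs * 1000.0

# toy model standing in for an interpreter's invoke() call
toy_model = lambda xs: [x * 0.5 for x in xs]
latency_ms = measure_latency(toy_model, list(range(1000)))
```

Averaging over many runs, after warm-up, is what makes per-optimization comparisons such as those above meaningful despite timer jitter on small models.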

3. Quantization for model size reduction:

  • Custom and MobileNet models achieve significant size reduction with TensorFlow Lite.
  • VGG16 exhibits notable size reduction with INT16 and INT8 optimizations.
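The size reductions above follow directly from bytes-per-parameter arithmetic. This back-of-envelope sketch uses the commonly quoted VGG16 parameter count as an assumption, not a figure measured in this study:

```python
# Weight storage scales with bits per parameter:
# FP32 = 32 bits, INT16 = 16 bits, INT8 = 8 bits.

def weight_bytes(num_params, bits):
    """Approximate weight storage in bytes for a given precision."""
    return num_params * bits // 8

params_vgg16 = 138_000_000            # commonly quoted VGG16 parameter count
fp32 = weight_bytes(params_vgg16, 32)
int16 = weight_bytes(params_vgg16, 16)
int8 = weight_bytes(params_vgg16, 8)
reduction = fp32 / int8               # INT8 stores 4x fewer bytes than FP32
```

Real TensorFlow Lite files add quantization scales and graph metadata, so measured reductions land slightly below these ideal 2x (INT16) and 4x (INT8) ratios.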

Key Insights:

1. TinyML optimization effectiveness: Our research shows that TinyML is effective in significantly reducing model size through quantization without compromising model fidelity. Among the models optimized within the TinyML framework, VGG16 delivered the best accuracy, but only when its parameters were tuned to support it.

2. Effect on inference time: Overall, TinyML optimizations provide good inference speeds, even though some of them result in longer inference times. This emphasizes the significance of selecting optimization strategies carefully.

3. Quantization for size reduction: TinyML applications demonstrate impressive model size reductions, especially with Custom and MobileNet models. Additionally, using INT16 and INT8 optimizations significantly shrinks the size of VGG16 models, highlighting the value of quantization methods in TinyML.

4. Achieving balanced trade-offs for ideal implementation: For TinyML models to be deployed as effectively as possible, the ideal balance between accuracy, inference time, and model size must be found. Ongoing customization based on each model's unique requirements remains essential to navigating these trade-offs and ensuring optimal performance across a variety of deployment circumstances.

About the Author


Srinivas has over 25 years of experience spanning Consumer Electronics, Biomedical Instrumentation, and Medical Imaging. He has led research and development teams focused on end-to-end 3D/4D quantification applications and released several "concept to research to market" solutions. He also led a cross-functional team driving applied research, product development, human factors, clinical research, external collaboration, and innovation. He has garnered a diverse set of skills across varied problem domains, holds over 25 patent filings and 12 patent grants across varied domains, has mentored more than 30 student projects, guided more than 10 master's thesis students, serves as a peer reviewer for papers, and has been an IEEE Senior Member since 2007.


Malavika is an Electronics and Instrumentation engineer with experience in biomedical instrumentation. She also brings expertise in business analysis within supply chain management, leveraging data analytics tools to devise solutions that optimize and track KPIs. She predominantly contributes to shaping impactful solutions for customer-facing endeavors in technology and business domains.


Key Contributors:

  • Amol Gharpure: amol.gharpure@cyient.com
  • Akanksha Jarwal: akanksha.jarwal@cyient.com
  • Manshi Mithra: manshimithra.t@cyient.com

About Cyient

Cyient (Estd: 1991, NSE: CYIENT) partners with over 300 customers, including 40% of the top 100 global innovators of 2023, to deliver intelligent engineering and technology solutions for creating a digital, autonomous, and sustainable future. As a company, Cyient is committed to designing a culturally inclusive, socially responsible, and environmentally sustainable Tomorrow Together with our stakeholders.

For more information, please visit www.cyient.com

5G technology in agriculture

The advent of 5G technology will revolutionize global farming landscapes and open up multiple ways to establish and grow precision farming. The figure below shows that every element in modern agriculture, once connected to a high-speed, high-throughput 5G cellular network, works in tandem with the others to optimize resources and maximize yield. The imagery generated from SAR and GPR demands high throughput for transfer to a distant, central location or data cloud. Similarly, a low-latency communications network is essential for controlling farming equipment remotely.

Figure 10. Uses of 5G technology in agriculture

Future of Hyperautomation

Hyperautomation will continue to evolve and redefine industries. Here are a few trends that could shape its future:

Hyperautomation as-a-service

Cloud-based hyperautomation platforms will become more accessible, allowing organizations of all sizes to leverage automation as a service. This democratization of technology will drive innovation across sectors.

Human-automation collaboration

Rather than replacing humans entirely, hyperautomation will focus on enhancing human capabilities.

Industry-specific solutions

Hyperautomation will be tailored to meet the specific needs of different industries. We can expect specialized solutions in sectors like healthcare, manufacturing, telecom, energy, and utilities, addressing industry-specific challenges and requirements.

Enhanced cognitive capabilities

Advances in AI, ML, and Gen AI will lead to even more sophisticated cognitive capabilities, enabling systems to handle complex decision-making and problem-solving tasks.

IoT integration

IoT will become more tightly integrated with hyperautomation. Sensors and data from connected devices will be used to optimize and automate processes in real time.

Cross-industry collaboration

Industries will increasingly collaborate and share best practices for hyperautomation implementation. This cross-pollination of ideas will accelerate innovation and adoption.

Regulatory frameworks

Governments and regulatory bodies will establish frameworks to address the ethical and legal implications of hyperautomation, ensuring a responsible and fair use of the technology.

In the future, we can expect to see even more changes in the way hyperautomation is used and implemented. Advances in IoT, blockchain, and quantum computing will open opportunities for hyperautomation to be applied in new domains and enable it to automate highly complex tasks and processes.

About the Author

Vishnu Gaddam

Vishnu Gaddam, an aerospace professional with Cyient since 2011, brings over 15 years of experience in the field. His journey began at GE Aviation, where he assessed both small and large aero engines for static and dynamic behavior. At Cyient, Vishnu works on complete aero engines and evaluates loads under normal and extreme operational conditions. He is a certified Project Management Professional (PMP), Certified Scrum Master (CSM), and Six Sigma Green Belt. As a subject matter expert, Vishnu collaborates across teams, fostering knowledge exchange.
