APTISI Transactions on Technopreneurship (ATT) Vol. No. November 2025, pp. 779-792
E-ISSN: 2656-8888 | P-ISSN: 2655-8807. DOI:10.

Deep Learning Technique for Interpretable Diagnosis of Polycystic Ovary Syndrome in Ultrasound Imaging

Pratibha Pandey1, Sumit Chaudhary2, Xin Nie3*
1, 2 Department of Computer Science and Engineering, Uttaranchal University, India
3 School of Computer Science and Engineering, Wuhan Institute of Technology, China
1 pratibhapandey8502@gmail.com, 2 drchaudharysumit@gmail.com, 3 niexin@whu.
*Corresponding Author

ABSTRACT
Polycystic Ovary Syndrome (PCOS) is a widespread condition in women that is associated with hormonal disorders, absent or infrequent menstruation, and cyst formation on the ovaries. Diagnosing PCOS remains difficult because ultrasound images are ambiguous. Earlier approaches emphasize the detection and classification of follicles in ultrasound images and disregard interpretability, which obscures the model's decisions and undermines trust in critical medical applications. This research proposes a two-stage method for diagnosing PCOS. First, follicles are located with a YOLOv8 model equipped with attention-guided multiscale feature fusion. The follicles are then classified with a ResNet50 model that has SE blocks. Finally, Grad-CAM shows which features of the image the classification model used, explaining its decision and providing meaningful insight into the model's predictions. Evaluation was performed on two publicly available datasets. The proposed method outperforms all other methods in follicle detection with 92% mean average precision (mAP) and reaches classification accuracies of 98.25% and 94.56% on datasets 1 and 2, respectively. This result makes the proposed model a reliable and transparent technique for robust clinical applications.
By providing a scalable, deployable, and interpretable diagnostic pipeline that can be integrated into AI-based health-tech platforms, the suggested method also fits the objectives of technopreneurship incubator models.

Article history: Submission June 15, 2025; Revised July 26, 2025; Accepted September 29, 2025; Published October 9, 2025
Keywords: PCOS Diagnosis, Deep Learning, Follicles Detection, Interpretability, Ultrasound Images
This is an open access article under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.). Authors retain all copyrights.
DOI: https://doi.org/10.34306/att.
Journal homepage: https://att. id/index. php/att

INTRODUCTION
Polycystic Ovary Syndrome is defined as a multicystic disorder of the endocrine system that affects females in their reproductive years and tends to originate in the teenage years. The exact cause is still unknown; however, several factors such as family history, surroundings, and habits are noted to contribute to the illness. The syndrome combines hormonal disturbances, ovarian abnormalities, and altered insulin sensitivity, and, as a result, the associated features include amenorrhea, hirsutism, change in body mass index, and polycystic ovaries. The recent WHO report indicates that PCOS is found in 8-13% of women, and that in 6-18% of young women of reproductive age it is "exclusively" linked to infertility. Findings show that up to 70% of women with this disorder are either sterile from nonfertility ovarian cysts or anovulatory.

Diagnosis of this condition, especially with images acquired during ultrasound examination, is difficult because both the disease and its presentation are complicated.
Ovarian morphology varies considerably for any one woman across the menstrual cycle and cannot be relied on as a precise clinical diagnostic aid, since its assessment usually depends on the subjective observations of the radiologist. Worse yet, factors such as low resolution and noisy images further obscure critical features like follicle margins and stromal mass. This also presents a big problem for automated diagnostic systems, since databases for training specialized AI can be either inconsistent or missing, while many machine learning models themselves are too opaque to be employed practically. Such issues are why deep learning, better-optimized algorithms, and standardized data must be used to enhance diagnostic yield and the treatment utility of the obtained results.

Because analyzing ultrasound images to diagnose PCOS is difficult, several approaches have been established to address the problems noted above. One usual practice is to segment the images with complex segmentation methods that account for variation in the shape and structure of the ovaries. Size-dependent filtering, contour detection, texture analysis, and level-set segmentation help enhance the perimeter of ovaries and follicles so that proper dimensions can be measured for diagnosis. Other techniques, which include the evaluation of texture changes in the ovarian tissue and follicular regions, are also used to distinguish PCOS from normal ovaries. These methods, however, have limitations. They are sensitive to image noise, low contrast, and irregular shapes. They also require careful parameter tuning and frequently fail in complicated or overlapping areas, which raises concerns about their stability.
Although many works aim at diagnosing PCOS or at jointly detecting and classifying it from ultrasound images, none has so far considered the interpretability of the deep learning models. Interpretability is highly valued in the medical field, as it enables one to understand why predictions were made or why the model reaches a certain level of accuracy. This matters especially for trust-building, because it lets the doctor check that the model's choices are in line with medical expertise. When the model errs, interpretability points to issues such as biased data or features that the model did not incorporate, which can then lead to enhancements. Finally, interpretable models provide more trustworthy decisions, which benefits both safety and clinical practice. The suggested interpretable AI framework provides a deployable, accurate, and explicable diagnostic tool, bridging the gap between academic innovation and healthcare startup potential. This speeds up translation into practical applications and opens doors for scalable clinical solutions, particularly in tech-driven health diagnostics.

This paper addresses the challenges in PCOS diagnosis by introducing a combined approach for detection, classification, and interpretability. The research proposes a new approach that integrates YOLOv8 with Attention Guided Multi-Scale Feature Fusion (AGMSFF) to develop a more accurate model capable of detecting follicles in PCOS ultrasound images and optimizing feature selection to improve accuracy. The detected images are further processed using a classification model based on ResNet50 with a Squeeze-and-Excitation (SE) block. The SE block improves classification performance by adaptively amplifying essential features, thus yielding more accurate outcomes.
A Grad-CAM interpretability method is also implemented on the classification model so that users such as clinicians and researchers can visually understand the rationale behind the model's predictions and therefore trust the model's decisions, enhancing its reliability and usability in medicine.

The primary contributions of this paper are as follows:
• A new detection method using YOLOv8 fused with Attention-Guided Multi-Scale Feature Fusion was developed to accurately detect abnormalities in ultrasound imaging of PCOS.
• A classification approach based on ResNet50 was developed by adding Squeeze-and-Excitation (SE) blocks that markedly enhanced performance. Additionally, an interpretability technique was implemented for the proposed model, providing transparent insights into its decision-making process and fostering greater trust in its application.
• Grad-CAM was incorporated for interpretability, which is rarely addressed in prior PCOS studies.

The rest of this paper is organized as follows: Section 2 presents existing studies on different classification and detection techniques. Section 3 explains the methodology, including the proposed method. Section 4 reports experimental results. Finally, Section 6 concludes the paper with closing remarks.

LITERATURE REVIEW
Because PCOS diagnosis is complex and time-consuming, owing to diverse symptom combinations, costly biochemical tests, and ovary scanning, one study suggested an AI-based computational approach for classifying PCOS. Initially, a correlation-based feature extraction method was applied to clinical data, reducing 45 features to 17, with SVM achieving the highest accuracy among ML algorithms. Furthermore, VGG16 outperformed the CNN model on ultrasound image analysis. Similarly, another study
suggested a DL model for automated PCOM detection using the Inception architecture, combining ovary ultrasound images with patient clinical data, and developed a fusion model that integrates image and clinical features for PCOS diagnosis, leveraging MobileNet for feature extraction. Furthermore, because weak contrast, speckle noise, and hazy boundaries in ultrasonography hinder accurate segmentation and classification, Sha proposed a guided trilateral filter for noise reduction, an AdaResU-Net for precise segmentation, and a pyramidal dilated convolutional network for cyst classification, optimized using the Wild Horse Optimization algorithm to enhance diagnostic precision and reliability. At the same time, another work developed a denoising filter using the Two-Dimensional Fractional Fourier Transform (2D-FrFT) to remove artifacts in the time-frequency plane; it optimizes the fractional operator parameter via a VGG-16 model, effectively reducing noise and enhancing image quality. A further study proposed an active contour method combined with a modified Otsu method to automate follicle detection for PCOS. Preprocessing used traditional filters to reduce noise, with the optimal filter selected to enhance image quality, addressing the challenges posed by varying follicle sizes and their relationships with blood vessels and tissues, which often lead to errors in manual interpretation. Another study explained the shortcomings of an automated system powered by Fuzzy Rule-Based Convolutional Neural Networks (FCNN) for cyst detection from ultrasound images. An automated system that utilizes FCNN improves consistency in clinical decision-making by assisting physicians with cyst detection. Other studies proposed a feature selection method with a tri-stage wrapper approach that optimizes feature relevance, improving classification accuracy while minimizing computational resources.
In a different study, a method comprising four stages was developed: preprocessing, segmentation with an enhanced watershed algorithm, follicle detection with SCBOD feature recognition, and PCOS versus non-PCOS classification with SVM. This method addresses issues related to speckle noise, time complexity, and misclassification based on the size and number of follicles. Another approach addressed the constraints of manual cyst detection and error-prone machine learning techniques with a hybrid strategy that combines VGGNet16 as a feature extractor and XGBoost as a meta-learner to improve the reliability and efficiency of diagnosing PCOS from ovarian USG images. Non-invasive clinical parameters were employed and various ML algorithms were assessed to detect PCOS, demonstrating the efficacy of the RF model for screening and validating its predictive ability using the Out-of-Bag (OOB) error. To improve diagnostic accuracy and processing efficiency, traditional and ensemble classifiers were used along with filter, embedded, and wrapper-based feature selection approaches. The results showed that the Ensemble RF classifier, combined with the embedded feature selection method, outperformed other classifiers in accuracy and sensitivity.

Recent studies have explored advances in medical imaging technology to improve robustness and image enhancement. A comprehensive survey highlighted novel techniques and remote applications that are efficient in medical imaging technology. Telemedicine in the mobile application Halodoc enhanced patient-doctor consultations while ensuring data privacy and efficiency. Another study explored the use of gradient centralization in conjunction with advanced preprocessing techniques for enhanced performance in medical image analysis.
RESEARCH METHODS
Figure 1 below shows the proposed method for detecting and classifying PCOS from ultrasound images. First, images from a dataset undergo preprocessing, which includes normalization, noise reduction, and reshaping. Data augmentation is also applied to enhance the training data. Next, follicles in the images are detected using YOLOv8 with the specialized AGMSFF module. The detected regions are then classified using a ResNet50 model with SE blocks, which improves accuracy by focusing on important features. The system classifies the image as either "infected" or "not infected" and uses Grad-CAM for interpretability, helping to visualize which parts of the image influenced the decision. Owing to its enhanced architectural efficiency and better real-time object detection capabilities, YOLOv8 was chosen as the best option for detecting small and variable-sized follicles. SE-ResNet50 was chosen for its strong feature extraction capability, enhanced by Squeeze-and-Excitation blocks that emphasize informative channels crucial for classifying subtle differences in ultrasound images. A detailed explanation of each stage is provided below.

Figure 1. Block diagram of proposed denoise method

Dataset Description and Noise Addition
This study is based on a dataset retrieved from the Kaggle repository, which consists of 1,932 images for training and 1,924 images for testing. The overlap between the training and testing subsets was significant in the existing dataset; hence, the test set was discarded, and the entire dataset put aside for testing was utilized for training. This dataset was later partitioned into new training and test datasets. It contains ultrasound images labelled as INFECTED . images showing cystic ovaries) and NOT INFECTED . ,143 images of healthy ovaries). These labels distinguish people suffering from PCOS from those who are not.
This classification is particularly useful in the context of real-time medical systems because it enables automated and precise diagnosis of PCOS. The study also adds the PCOSGen dataset, collected from multiple online sources. This dataset includes 1,468 unhealthy samples and 3,200 healthy samples, which were split into training and testing sets. A New Delhi-based gynaecologist annotated all the samples. Sample images of both classes are provided in Figure 2.

Figure 2. Different Sample images from both classes

Data Pre-Processing
At this stage, the ultrasound images are first normalized by scaling the pixel intensity values to the range 0 to 1. To mitigate distortion, a median filter is employed, which is known to efficiently eliminate the speckle noise present in ultrasound imagery. The filter replaces a given pixel's value with the median of the pixels in its surrounding neighbourhood, which lessens noise without losing the vital details in the image. Lastly, to standardize images that may differ in size, all images are resized to 128x128x3.

Data Augmentation
Data augmentation is a crucial step in computer vision: it enlarges the dataset and increases its diversity, helping models generalize and perform accurately in practical scenarios. In this research, zooming, rotation, flipping, and shearing augmentations were performed. These image alterations replicate real-life situations where an object's position and orientation can change, providing numerous additional images derived from the originals. This not only increases the size of the dataset to curb overfitting but also helps the model capture varied, distinctive attributes, making its performance more reliable across diverse inputs.
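The preprocessing and augmentation steps described above can be illustrated with a minimal pure-Python sketch. It assumes 8-bit grayscale images stored as nested lists, a 3x3 median window, and only two of the listed augmentations (flip and 90-degree rotation); the function names are hypothetical, the resize to 128x128x3 is omitted, and none of this is taken from the authors' implementation.

```python
from statistics import median


def normalize(image):
    """Scale 8-bit pixel intensities (0-255) to the range [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def median_filter(image, radius=1):
    """Replace each pixel with the median of its neighbourhood.

    The window is (2*radius+1) x (2*radius+1); at the borders only the
    neighbours that fall inside the image are used (an assumption).
    """
    height, width = len(image), len(image[0])
    filtered = []
    for r in range(height):
        new_row = []
        for c in range(width):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            new_row.append(median(window))
        filtered.append(new_row)
    return filtered


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


# A single bright speckle in a flat region is removed by the median filter.
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
clean = median_filter(noisy)
```

A median filter is a natural fit for isolated speckle because one outlier in the window cannot move the median, whereas it would shift a mean filter's output.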
Figure 3 below illustrates the different augmentation techniques used on the images.

Figure 3. Various augmentation techniques applied on original images

Follicle Detection
Before the classification task, detection is a crucial step in PCOS diagnosis. Follicles in the ovaries are one of the most significant grounds for diagnosing PCOS. Reliable identification of these follicles in ultrasound images is critical for locating the ROIs that are most pertinent to abnormal ovarian morphology. In this paper, YOLOv8 (You Only Look Once Version 8) with the AGMSFF module is used to recognize the follicles in the ultrasound image. This modification makes it easier for the model to detect follicles of different sizes and fine-grained details. The AGMSFF is incorporated into the structure of YOLOv8 as a feature extraction and feature fusion module. It uses an attention mechanism alongside multi-scale feature fusion to refine the predictions. The attention mechanism learns weights with respect to the spatial and channel dimensions, emphasizing noteworthy regions such as follicle boundaries and down-weighting unimportant parts. Mathematically, the attention mechanism can be expressed as:

$$F_{att} = \sigma\left(W_s \cdot \mathrm{GAP}(F)\right) \odot F$$

where $F$ is the input feature map, $\sigma$ denotes the sigmoid activation, $W_s$ represents learnable parameters, and $\mathrm{GAP}(\cdot)$ is the global average pooling operation. The resulting attention-refined feature map $F_{att}$ enhances the detection of follicle-like regions.

Multi-scale feature fusion combines features from different layers of the model, allowing the detection head to capture follicles of varying sizes. Given feature maps $F_1$, $F_2$, $F_3$ from different scales, the fusion is performed as:

$$F_{fused} = F_1 + \mathrm{Up}(F_2) + \mathrm{Up}(\mathrm{Up}(F_3))$$

where $\mathrm{Up}(\cdot)$ represents up-sampling to ensure spatial alignment.
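As a rough illustration of the two operations just described, the sketch below applies a channel gate of the form sigmoid(W_s · GAP(F)) to a feature map and fuses three scales after nearest-neighbour up-sampling. Several points are assumptions rather than details from the paper: feature maps are nested lists of shape channels x height x width, W_s is a square channel-by-channel matrix, each scale halves the spatial size, and the fusion operator is element-wise addition. Function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def global_average_pool(feature_map):
    """Return one mean value per channel of a C x H x W feature map."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def channel_attention(feature_map, weights):
    """Gate each channel by sigmoid(W_s . GAP(F)); weights is a C x C matrix."""
    pooled = global_average_pool(feature_map)
    gates = [
        sigmoid(sum(w * p for w, p in zip(weight_row, pooled)))
        for weight_row in weights
    ]
    return [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]


def upsample_2x(feature_map):
    """Nearest-neighbour up-sampling: every pixel becomes a 2 x 2 block."""
    upsampled = []
    for channel in feature_map:
        new_channel = []
        for row in channel:
            wide_row = [value for value in row for _ in range(2)]
            new_channel.append(wide_row)
            new_channel.append(list(wide_row))
        upsampled.append(new_channel)
    return upsampled


def fuse(f1, f2, f3):
    """Element-wise sum of F1, Up(F2) and Up(Up(F3)) (addition is assumed)."""
    up2 = upsample_2x(f2)
    up3 = upsample_2x(upsample_2x(f3))
    return [
        [[a + b + c for a, b, c in zip(r1, r2, r3)] for r1, r2, r3 in zip(c1, c2, c3)]
        for c1, c2, c3 in zip(f1, up2, up3)
    ]
```

With all-zero weights every gate equals 0.5, so the attention step simply halves the input; trained weights would instead raise the gate for channels that respond to follicle boundaries.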
This multi-scale fusion ensures that the model retains both fine details and broader contextual information, which is critical for detecting small and irregularly shaped follicles. The improved YOLOv8 with AGMSFF processes ultrasound images and outputs bounding boxes for detected follicles. Each bounding box is associated with a confidence score $S$ and a class label $C$, calculated as:

$$S_c = \sigma\left(W_c \cdot F_{fused}\right)$$

where $W_c$ are classification weights. This detection output is then passed to the classification model for further analysis. Incorporating AGMSFF in this way enhances detection performance for follicle identification.

Classification
Once follicles are accurately detected, the next step is classification to determine whether an ovary is normal or abnormal. Normal images typically lack follicles, whereas abnormal images contain detectable follicles, a hallmark of PCOS. This classification is critical in distinguishing healthy individuals from those requiring further medical attention. To perform this task, we propose a Modified ResNet50 with Squeeze-and-Excitation Blocks (SE-ResNet50), designed to handle the complexities of ultrasound imaging by improving feature extraction and focusing on relevant regions. The ResNet50 architecture is enhanced by integrating Squeeze-and-Excitation (SE) blocks after each residual block. The SE blocks adaptively recalibrate the feature maps to prioritize the most informative channels, significantly improving the model's ability to classify subtle differences between normal and abnormal ovarian structures.

The SE block operates in two stages: squeeze and excitation. For a feature map $F \in \mathbb{R}^{H \times W \times C}$, where $H$, $W$, and $C$ represent the height, width, and number of channels, the squeeze step computes global context by applying Global Average Pooling (GAP) across the spatial dimensions:

$$z_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} F_{i,j,c}$$

This produces a channel-wise descriptor $z \in \mathbb{R}^{C}$.
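The squeeze step above, together with the excitation and rescaling steps that the text details next, can be sketched in a few lines of pure Python. This is a minimal illustration, not the authors' code: the feature map is assumed to be a nested list of shape channels x height x width, biases are omitted, and the names `se_block`, `w1`, and `w2` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def se_block(feature_map, w1, w2):
    """Squeeze-and-Excitation over a C x H x W feature map.

    w1 has shape (C_reduced x C) and w2 has shape (C x C_reduced).
    Returns the recalibrated feature map and the channel weights s.
    """
    # Squeeze: global average pooling gives one descriptor z_c per channel.
    z = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: s = sigmoid(W2 . ReLU(W1 . z)).
    hidden = [max(0.0, sum(w * zc for w, zc in zip(row, z))) for row in w1]
    s = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every value in channel c by its weight s_c.
    scaled = [
        [[weight * value for value in row] for row in channel]
        for weight, channel in zip(s, feature_map)
    ]
    return scaled, s


# Two channels of size 1 x 2, reduced to one hidden unit.
features = [[[1.0, 3.0]], [[2.0, 2.0]]]
recalibrated, channel_weights = se_block(features, w1=[[0.0, 0.0]], w2=[[1.0], [1.0]])
```

The bottleneck (C reduced to C_reduced and back) keeps the number of added parameters small relative to the convolutional layers, which is the usual reason for the two-layer design.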
In the excitation step, fully connected layers with activation functions learn channel-wise dependencies and generate attention weights $s$:

$$s = \sigma\left(W_2 \cdot \mathrm{ReLU}(W_1 \cdot z)\right)$$

Here, $W_1$ and $W_2$ are learnable weights, and $\sigma$ is the sigmoid activation. The feature map is recalibrated by scaling each channel with its corresponding weight:

$$\tilde{F}_{i,j,c} = s_c \cdot F_{i,j,c}$$

This allows the model to emphasize critical features, such as cyst-like patterns, while suppressing irrelevant noise. The modified ResNet50 extracts features from the ultrasound images while the SE blocks recalibrate the feature maps at each residual stage. The final layer performs binary classification into normal or abnormal categories, with a softmax output:

$$y = \mathrm{softmax}\left(W_f \cdot F_{final} + b_f\right)$$

where $W_f$ and $b_f$ are the weights and bias of the final fully connected layer, and $F_{final}$ represents the output feature vector. This integration provides effective classification results over existing methods. These improvements make SE-ResNet50 particularly effective for accurate classification of normal and abnormal ovaries in PCOS diagnosis.

Interpretability
Interpretability plays a decisive role in medical AI and is especially important in PCOS detection, where decisions must be transparent, justified, acceptable to clinicians, and consistent with logical reasoning. Highlighting the areas of the ultrasound image on which a model bases its decision helps build confidence and reveals when the model relies on biased or irrelevant regions. Nevertheless, most PCOS detection studies focus on detection error rather than interpretability, which holds back clinical application, where explainable results are significant. To fill this gap, we incorporate Gradient-weighted Class Activation Mapping (Grad-CAM) to identify the image regions that guide the model's predictions, improving the credibility of the diagnosis.
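To make the Grad-CAM computation concrete, the following sketch builds a heatmap from the last convolutional layer's activations and the gradients of the class score with respect to them. It assumes those gradients have already been obtained from an automatic differentiation framework and are passed in as nested lists of shape channels x height x width; the function names are hypothetical and the example values are invented for illustration.

```python
def grad_cam(activations, gradients):
    """Return an H x W Grad-CAM heatmap.

    Each channel weight is the spatial mean of that channel's gradients;
    the heatmap is the ReLU of the weighted sum of the activation maps.
    """
    height, width = len(activations[0]), len(activations[0][0])
    num_locations = height * width
    # One importance weight per feature map (channel).
    channel_weights = [
        sum(sum(row) for row in grad_map) / num_locations for grad_map in gradients
    ]
    # Weighted combination of feature maps, keeping only positive evidence.
    return [
        [
            max(0.0, sum(w * act[i][j] for w, act in zip(channel_weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]


def normalize_heatmap(heatmap):
    """Scale a heatmap to [0, 1] so it can be colour-mapped and overlaid."""
    peak = max(max(row) for row in heatmap)
    if peak == 0:
        return [list(row) for row in heatmap]
    return [[value / peak for value in row] for row in heatmap]


# Two 1 x 2 feature maps: the first supports the class, the second opposes it.
activations = [[[1.0, 2.0]], [[3.0, 1.0]]]
gradients = [[[1.0, 1.0]], [[-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

In this toy case the left location is dominated by the opposing feature map and is clipped to zero, so only the right location remains highlighted.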
Grad-CAM works by utilizing the gradients of the output class score $y^c$ with respect to the feature maps $A$ of the last convolutional layer. The importance weights $\alpha_k^c$ for each feature map are computed as:

$$\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}}$$

where $Z$ is the spatial dimension of the feature map, and $A^k_{ij}$ represents the activation at spatial location $(i, j)$. The Grad-CAM heatmap $L^c$ is then generated by a weighted combination of the feature maps:

$$L^c = \mathrm{ReLU}\left(\sum_{k} \alpha_k^c A^k\right)$$

The ReLU function ensures that only features positively influencing the class score are highlighted. The resulting heatmap is overlaid on the original image to provide a visual explanation of the model's decision. Using Grad-CAM, our system enables clinicians to verify whether the model's focus aligns with diagnostically relevant areas, such as cysts in abnormal images or uniform ovarian structures in normal cases. This added layer of interpretability ensures that the classification model is not only accurate but also reliable and transparent, making it suitable for real-world clinical applications.

RESULTS AND DISCUSSION
This section describes the experimental procedure, including dataset preparation, the hyperparameters used, model training, and evaluation using various performance metrics, along with a comparative analysis of results.

Experimental Setup
In this experiment, two publicly accessible ultrasound image databases were used. After feature extraction, the dataset was split into two parts: 2,685 images for training and 671 images for testing. The pipeline begins with follicle detection, followed by classification and then interpretability analysis. This approach provides a general overview of the model's ability to identify and categorize images, especially under noisy conditions.
Testing under such conditions helps ensure that the model will be effective in real-world settings.

System and Hyperparameter Specifications
The proposed model is trained on a system with an Intel Xeon Silver 4216 CPU, 128 GB of RAM, and an Nvidia RTX A4000 GPU with 16 GB of VRAM. Mean Squared Error (MSE) is used as the loss function, and Adaptive Moment Estimation (Adam), which integrates momentum and adaptive learning rates for effective convergence, is used as the optimizer to train the denoise model. To balance stability and computational efficiency, the batch size is 64 and the learning rate is set to 0. The model is trained for 50 epochs, which enables it to learn from the image dataset and classify the labels, improving its generalization abilities for practical uses.

Performance Evaluation Metrics
In this paper, the proposed model is evaluated using various performance metrics. For follicle detection, the mean Average Precision (mAP) score is used, which measures the precision of the model at different recall thresholds, providing an overall assessment of detection accuracy. In the classification task, metrics such as accuracy (the percentage of correctly classified instances), precision (the proportion of true positives among predicted positives), recall (the ability to correctly identify all positive instances), and F1-score (the harmonic mean of precision and recall) are employed. Additionally, the ROC-AUC curve (illustrating the trade-off between true positive rate and false positive rate) and the Confusion Matrix (showing the distribution of predicted and actual classes) are used for visual representation of the results.

Experimental Results for Follicles Detection
Follicle detection was carried out with several methods in this paper, including our proposed method. Figure 4 below shows the detection results for ovarian follicles in ultrasound images obtained with the compared approaches, which include OCD-FCNN, YOLOv8,
Attention-Guided Multi-Scale CNN, and the proposed method. All the techniques are benchmarked against ground truth annotations to assess their efficacy. OCD-FCNN, a model for ovarian follicle detection and classification, shows limited performance on this metric, with a mAP of just 55%. This suggests that it may not identify and classify ovarian follicles well, which leads to incomplete or imprecise predictions. With a 78% mAP, YOLOv8 succeeds in detection but does not identify all the targets or achieve comprehensive follicle coverage. In the same manner, the Attention-Guided Multi-Scale CNN yielded performance comparable to YOLOv8, which shows that AMSCNN cannot outcompete YOLOv8 in detecting finely nuanced details of follicles.

Figure 4. Detection result of proposed method along with different state of art methods

In contrast, the proposed method fuses the Attention-Guided Multi-Scale CNN with YOLOv8 so that the strengths of both models contribute to the result. The integration of YOLOv8, an improved object detection algorithm, with the attention-guided mechanism leads to improved detection accuracy and robustness, reaching a mean Average Precision (mAP) of 92%. The proposed method locates ovarian follicles accurately and obtains better results in terms of precision and coverage, proving its applicability in more complicated medical image analysis operations.

Experimental Results for Classification
The proposed classification method demonstrates exceptional effectiveness, achieving the highest accuracy on both Dataset 1 and Dataset 2, as shown in Table 1. On Dataset1, it reached 98.25%, outperforming ResNet50 . 69%), DenseNet121 . 87%), and VGG16 . 26%). Incorporating SE blocks significantly boosted baseline models, with DenseNet121 improving to 94.65% . 87%) and VGG16 to 91.
26%). Similarly, on Dataset2, the proposed method achieved 94.56%, surpassing ResNet50 . 21%), DenseNet121 . 33%), and VGG16 . 66%). SE blocks consistently enhanced feature representation, enabling models to capture subtle data patterns and generalize better. This led to an improvement of 2.56% over ResNet50 and 3.6% over DenseNet121 (with SE blocks) on Dataset 1, and a 4.35% and 4.45% improvement on Dataset 2, respectively. These results confirm that SE blocks recalibrate channel-wise responses to focus on critical features, advancing the state of the art in classification tasks.

Table 1. Performance evaluation of proposed classification method with state of art methods.
Dataset    Models                                         Accuracy (%)   Precision (%)   Recall (%)   F1-Score (%)
Dataset1   Vgg16
Dataset1   Resnet50
Dataset1   Densenet121
Dataset1   Vgg16 with SE Blocks
Dataset1   Densenet121 with SE Blocks
Dataset1   Resnet50 with SE blocks (Proposed Method)
Dataset2   Vgg16
Dataset2   Resnet50
Dataset2   Densenet121
Dataset2   Vgg16 with SE Blocks
Dataset2   Densenet121 with SE Blocks
Dataset2   Resnet50 with SE blocks (Proposed Method)

Figure 5 below shows the proposed model's training and validation accuracy on the two datasets. The left plot corresponds to the first dataset and the right plot to the second. The blue curve in each plot represents the training accuracy, while the orange one represents the validation accuracy. The model shows rapid growth in the first epochs and then gradually converges to the optimal accuracy as the number of epochs increases. The training and validation accuracies are equally high, and in that sense the model may be claimed to generalize well, with little risk of overfitting.

Figure 5. Accuracy Vs Epoch graph of proposed model on both datasets

The ROC curve in Figure 6 illustrates the performance of five models in binary classification tasks, comparing their ability to distinguish between two classes on both datasets.
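For readers who want to reproduce the kind of figures reported in Table 1 and the ROC analysis, the sketch below computes accuracy, precision, recall, F1-score, and ROC-AUC for a binary classifier. It is a generic illustration with hypothetical function names and invented example labels, not the evaluation code used in this study; the AUC is computed with the pairwise-ranking definition, which assumes both classes are present.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """Probability that a random positive scores above a random negative.

    Ties count as half; this equals the area under the ROC curve.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


metrics = classification_metrics([1, 1, 0, 0], [1, 0, 0, 0])
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to the diagonal of the ROC plot (random guessing), and 1.0 to a classifier that ranks every positive above every negative.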
On Dataset 1, the Proposed Method achieves the best performance with a perfect Area Under the Curve (AUC) of 100%, followed by DenseNet SE with an AUC of 98%. VGG16 and ResNet50 perform moderately well, each with an AUC of 88%, while DenseNet has a slightly lower AUC of 85%. On Dataset 2, the Proposed Method has an AUC of 92%, which indicates that the proposed model generalizes. The closer a curve is to the top-left corner, the better the model performs, and all models outperform random guessing (the diagonal line). This demonstrates the Proposed Method's superiority in classification accuracy compared to the others.

Figure 6. ROC curve of proposed method along with other methods on both datasets

Table 2 below compares the performance of existing techniques with the proposed model across two datasets in terms of accuracy. For Dataset 1, previous methods demonstrate high accuracy, but the proposed method surpasses them with an accuracy of 98.25%, showcasing its superior performance. Similarly, for Dataset 2, existing techniques achieve notable accuracy, but the proposed method achieves the highest accuracy at 94.56%. These results highlight the proposed model's effectiveness in outperforming state-of-the-art approaches across different datasets.

Table 2. Performance evaluation of proposed classification methods with existing methods.
Dataset    Models            Accuracy (%)
Dataset1   Proposed Method
Dataset2   Proposed Method

Experimental Results for Interpretability
The provided figure illustrates the interpretability of the proposed model using Grad-CAM visualization applied to the last layer of the network. Figure 6 below shows the Grad-CAM representation of the proposed model on abnormal images. Interpretability is essential in deep learning, particularly in medical imaging tasks like ultrasound analysis, to ensure that the model's decisions are transparent and clinically reliable. By visualizing the regions of focus within the image,
Grad-CAM helps validate the model's behavior, ensuring its predictions align with human expert understanding, which is critical for establishing trust and enhancing decision-making in sensitive domains. Grad-CAM is a widely used method for generating visual explanations of a model's predictions. It highlights the regions in the input image that most strongly influence the model's decision. By computing the gradients of the predicted class score with respect to the feature maps of a specific layer, Grad-CAM produces a heatmap that is overlaid on the original image, indicating areas of high importance. The top row of the figure presents the grayscale ultrasound images, while the corresponding Grad-CAM visualizations in the bottom row highlight the regions of interest (e.g., cysts or abnormalities) as color-coded heatmaps, ranging from cool (less important) to warm (more important). These visualizations indicate that the model focuses on diagnostically relevant structures, supporting its robustness and interpretability.

Discussion

This paper classifies PCOS from ultrasound images while addressing the problems of poor image quality and the inherent complexity of the disease. Previous approaches share the disadvantage that follicles are not detected; hence, their feature sets carry insufficient follicle information. To counter this, we present an attention-guided multiscale approach based on YOLOv8 for follicle detection, which localizes the specific regions of interest and provides better input to the classifier. The classification stage uses a ResNet50 model enhanced with SE blocks to strengthen feature representation and attend to the most important features. This two-stage system greatly improves the accuracy of the overall model. Further, the model aims for interpretability, which is crucial when deep learning-based medical image analysis is used for diagnosis.
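To make the interpretability step concrete, the sketch below illustrates the Grad-CAM computation described in the interpretability results: channel weights are obtained by spatially averaging the gradients of the class score, the feature maps are combined with these weights, and a ReLU keeps only the positively contributing regions. It is a simplified, framework-free illustration; the function name `grad_cam`, the nested-list inputs, and the toy values are assumptions, and in practice the activations and gradients would be extracted from the chosen layer of the trained network.

```python
def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from one layer's activations and gradients.

    feature_maps: K maps, each an H x W nested list of activations.
    gradients:    same shape, d(class score)/d(activation) for the predicted class.
    Returns an H x W heatmap scaled to [0, 1].
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    # Channel weight: spatial average of that channel's gradients.
    alphas = [sum(map(sum, g)) / (height * width) for g in gradients]
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = [
        [
            max(0.0, sum(a * fm[i][j] for a, fm in zip(alphas, feature_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    peak = max(max(row) for row in cam)
    if peak == 0.0:
        return cam  # no positive evidence anywhere in the map
    return [[value / peak for value in row] for row in cam]


# Hypothetical 2-channel, 2x2 example (values are illustrative only).
toy_maps = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
toy_grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
toy_heatmap = grad_cam(toy_maps, toy_grads)
```

In this toy case the second channel has a negative weight, so the locations where only it is active are suppressed to zero by the ReLU, and the strongest remaining response is scaled to 1. The resulting low-resolution map would then be upsampled to the image size and overlaid on the ultrasound image as a color-coded heatmap.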
The model thus incorporates follicle detection while also providing practitioners with an explanation of the classification results. This approach not only provides better accuracy; it also fills gaps in existing approaches by combining efficient detection with highly interpretable classification for assessing PCOS from ultrasound images.

Nevertheless, the proposed model has several drawbacks. During the detection stage, the attention-guided multiscale YOLOv8 model may fail to detect very small or poorly visible follicle candidates in low-quality ultrasound images, which may limit its applicability across different clinical settings. In the classification stage, adding SE blocks to the ResNet50 architecture significantly improves the network's ability to learn features; however, it also increases the number of computations. This leads to longer training and inference times, which limits the use of this model in real-time applications or on devices with low computational capability. These limitations suggest directions for improvement aimed at increasing the practical usefulness of such systems. Additionally, we consider potential deployment on edge devices or cloud-based systems to ensure usability across diverse healthcare environments, including low-resource settings. Furthermore, the proposed AI-based denoising model directly contributes to SDG 3 (Good Health and Well-being) by facilitating the early diagnosis of Polycystic Ovary Syndrome (PCOS). It aligns with SDG 9 (Industry, Innovation, and Infrastructure) by fostering AI innovation in ultrasound imaging.

MANAGERIAL IMPLICATIONS

The proposed deep learning-based diagnostic tool for PCOS has important implications for healthcare management and strategy.
The integration of AI into clinical practices can bring substantial improvements in efficiency, cost-effectiveness, trust, and scalability across various healthcare environments.

Enhancing Diagnostic Efficiency and Cost-Effectiveness

The introduction of AI tools into healthcare workflows significantly improves diagnostic efficiency, which is critical for high-volume healthcare settings. The AI system reduces the time needed for accurate PCOS diagnosis by automating the process of follicle detection and classification. This allows healthcare providers to process more cases, leading to improved patient throughput and reduced wait times.

Furthermore, AI's ability to automate diagnosis can result in considerable cost savings. By reducing reliance on specialized medical personnel, such as radiologists, healthcare organizations can lower labor costs. Additionally, the system's accuracy reduces the likelihood of misdiagnoses and unnecessary treatments, further contributing to cost efficiency. In environments where budget constraints are common, such AI-driven solutions provide an economically sustainable alternative to manual diagnostic methods.

Fostering Trust and Collaboration Through Interpretability

A significant barrier to AI adoption in healthcare is clinician trust. The incorporation of interpretability features, such as Grad-CAM, addresses this issue by providing clinicians with a clear rationale behind the AI model's predictions. By visually highlighting the areas of the ultrasound image that influenced the model's decision, Grad-CAM enhances transparency and allows clinicians to verify that the model is focusing on clinically relevant features, such as cysts or follicles.
This interpretability builds trust between clinicians and the AI system, fostering a collaborative environment where AI supports healthcare professionals in making informed decisions. With clear explanations of the AI's reasoning, healthcare providers can be more confident in relying on AI-assisted diagnostics, leading to more efficient decision-making and better patient outcomes.

CONCLUSION

This work proposes a new dual-stage model for PCOS diagnosis that achieves superior results by incorporating both follicle detection and classification. In the detection stage, the YOLOv8-based attention-guided multiscale model attains an mAP of 92%, improving the localization of the regions of interest. The subsequent classification stage, which uses a ResNet50 architecture with SE blocks, attains accuracies of 98.25% on Dataset 1 and 94.56% on Dataset 2, demonstrating the robustness of the proposed method. The integration of detection and classification not only increases diagnostic accuracy but also offers practitioners greater interpretability. Future research can evaluate the method on large and diverse databases and integrate the approach into real-time clinical practice. Future work will also focus on the potential deployment of our model on edge devices or cloud-based platforms to ensure usability across diverse healthcare environments, including low-resource settings.

DECLARATIONS

About Authors
Pratibha Pandey (PP) https://orcid.org/0000-0002-3122-4352
Sumit Chaudhary (SC) https://orcid.org/0000-0002-5455-774X
Xin Nie (XN) https://orcid.org/0000-0002-5005-9145

Author Contributions
Conceptualization: PP. Methodology: SC. Validation: PP and SC. Writing Original Draft Preparation: PP and SC. Writing Review and Editing: SC and XN. Visualization: SC. All authors, PP, SC, and XN, have read and agreed to the published version of the manuscript.
Data Availability Statement
The data presented in this study are available on request from the corresponding author.

Funding
This work was supported by a grant from the Hubei Key Laboratory of Intelligent Robot of China (Grant No. HBIRL202.

Declaration of Conflicting Interest
The authors declare that they have no conflicts of interest, known competing financial interests, or personal relationships that could have influenced the work reported in this paper.

REFERENCES