Data-Mining-in-Healthcare
Revision as of 14:57, 26 December 2024
Authors (HEADS-INFORM 2025):
- Henrique Pereira
- Olívia Oliveira
Introduction
Data mining refers to the process of discovering meaningful patterns and extracting actionable knowledge from vast amounts of data, often with the aid of computational tools.
The healthcare sector generates massive amounts of data, including electronic health records (EHRs), genomic sequences, medical imaging, wearable device outputs, patient-generated data, and clinical trial data. These data sources provide a fertile ground for applying data mining to improve care delivery, patient outcomes, and operational efficiency (Islam et al., 2018 [1]).
Importance:
- Enhances clinical decision-making;
- Facilitates the early detection and prediction of diseases;
- Improves operational efficiencies in healthcare systems;
- Supports public health research and epidemiological studies.
The image below, from Gordan et al., 2022 [2], shows a historical timeline of advancements in Structural Health Monitoring (SHM) and its integration with Artificial Intelligence (AI) and Data Mining technologies.
(image-evolution-dm)
The timeline traces significant developments across multiple decades, emphasizing the evolution of data mining techniques in SHM and highlighting their growing relevance from the 1990s onwards. Key milestones in this context include real-time data monitoring in 1992 and operational modal analysis in 2001.
The most recent phase, labeled "Data mining-based SHM" and running from 2014 to the present, underscores the role of data mining and AI in optimizing decision-making, supporting predictive analysis, and enhancing the overall reliability of critical infrastructure.
In the context of healthcare, this timeline provides an analogous framework for understanding how data mining and AI can revolutionize decision-making, diagnostics, and operational efficiency by leveraging advancements in real-time monitoring, predictive analytics, and data integration.
Key Data Mining Techniques in Healthcare [3][4]
Classification
Used for categorizing patients or medical conditions based on predefined labels (e.g., predicting disease risks).
How it works: Algorithms are trained on labeled datasets to learn patterns and apply these insights to new data.
Common algorithms: Decision Trees, Random Forest, Support Vector Machines (SVM), Neural Networks.
Use cases in Healthcare:
- Diabetes risk prediction based on patient history and demographics (Islam et al., 2018 [5]);
- Classification of medical images for cancer detection (e.g. using CT or MRI scans) (Islam et al., 2018 [5]).
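The train-then-apply cycle described under "How it works" can be sketched with a one-level decision tree (a "decision stump"), which is the simplest member of the Decision Trees family listed above. This is a minimal illustration, not a clinical model: the glucose values, the labels, and the function names `fit_stump` and `predict` are all made up for the example.

```python
# Synthetic, made-up records: (fasting glucose in mg/dL, label), 1 = high risk.
# These numbers are illustrative only and are not clinical data.
TRAIN = [(85, 0), (92, 0), (99, 0), (105, 0),
         (118, 1), (126, 1), (140, 1), (155, 1)]


def fit_stump(records):
    """Learn the threshold that misclassifies the fewest training records."""
    values = sorted(v for v, _ in records)
    # Candidate thresholds: midpoints between consecutive sorted values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_threshold, best_errors = None, len(records) + 1
    for t in candidates:
        # The rule "value > t means high risk" is wrong whenever it
        # disagrees with the known label.
        errors = sum((v > t) != bool(label) for v, label in records)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(threshold, value):
    """Apply the learned rule to a new, unlabeled value."""
    return 1 if value > threshold else 0


threshold = fit_stump(TRAIN)
print("learned threshold:", threshold)
print("prediction for a new patient with 130 mg/dL:", predict(threshold, 130))
```

A full decision tree repeats this threshold search recursively over several features; a Random Forest averages many such trees.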
Regression
Focused on identifying relationships between variables to predict continuous outcomes (e.g. mortality rates or healthcare costs).
Common algorithms: Linear Regression, Logistic Regression (which, despite its name, models the probability of a categorical outcome), Ridge and Lasso Regression.
Use cases in Healthcare:
- Estimating the length of hospital stays based on patient conditions;
- Analyzing the impact of cholesterol levels on cardiovascular disease outcomes (Islam et al., 2018 [5]).
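As a rough sketch of the length-of-stay use case, the snippet below fits a one-variable Linear Regression by ordinary least squares. The data points are synthetic and the helper names (`fit_line`, `predict_stay`) are hypothetical; a real model would use many more predictors and would be validated on held-out data.

```python
# Synthetic pairs: (patient age in years, length of stay in days).
# Illustrative values only, chosen to lie on a straight line.
DATA = [(30, 2.0), (40, 3.0), (50, 4.0), (60, 5.0), (70, 6.0)]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(DATA)


def predict_stay(age):
    """Predict a continuous outcome for an unseen input."""
    return slope * age + intercept


print(f"slope={slope:.3f} days per year, intercept={intercept:.3f} days")
print(f"predicted stay at age 55: {predict_stay(55):.2f} days")
```

Ridge and Lasso Regression use the same linear form but add a penalty on the size of the coefficients, which matters mainly when there are many correlated predictors.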
Clustering
Identifies natural groupings within data without predefined labels. This unsupervised technique is valuable for segmenting patients or detecting disease patterns.
Common algorithms: K-Means, Hierarchical Clustering, DBSCAN (Density-Based Spatial Clustering of Applications with Noise).
Use cases in Healthcare:
- Patient segmentation for personalized treatments;
- Identification of patterns in epidemic outbreaks based on geospatial data.
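To make the idea of grouping without labels concrete, here is a minimal K-Means sketch on a single feature. The ages, the fixed starting centroids, and the function name `k_means_1d` are assumptions made for the example; practical implementations use several features and typically randomized initialization.

```python
# Synthetic patient ages with no labels attached (illustrative only).
AGES = [22, 25, 27, 61, 64, 68]


def k_means_1d(values, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Starting centroids are fixed here so the run is reproducible.
centroids, clusters = k_means_1d(AGES, [20.0, 70.0])
print("centroids:", centroids)
print("clusters:", clusters)
```

The number of clusters has to be chosen in advance for K-Means, whereas Hierarchical Clustering and DBSCAN do not require it.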
Anomaly Detection
Detects outliers or irregular patterns in data; often applied to monitoring and quality assurance.
Common algorithms: Isolation Forest, Autoencoders, Statistical-Based Methods.
Use cases in Healthcare:
- Identifying fraudulent claims in insurance systems;
- Detecting rare adverse drug reactions in pharmacovigilance studies.
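One simple example of a statistical-based method is the modified z-score, which measures how far each value lies from the median in units of the median absolute deviation (MAD). The sketch below assumes synthetic heart-rate readings and the commonly used cutoff of 3.5; the function names and the cutoff are choices made for this illustration, not values taken from the sources cited here.

```python
from statistics import median

# Synthetic resting heart-rate readings in beats per minute (illustrative).
READINGS = [72, 75, 71, 74, 73, 76, 70, 140]


def modified_z_scores(values):
    """Score each value by its distance from the median, scaled by the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All values are (nearly) identical; nothing stands out.
        return [0.0 for _ in values]
    # 0.6745 makes the score comparable to a standard z-score
    # when the data are roughly normal.
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, cutoff=3.5):
    """Return the values whose absolute modified z-score exceeds the cutoff."""
    scores = modified_z_scores(values)
    return [v for v, s in zip(values, scores) if abs(s) > cutoff]


outliers = flag_outliers(READINGS)
print("flagged readings:", outliers)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme reading inflates the latter and can mask itself in a small sample.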
Association Rules
Used to uncover frequent co-occurrences between variables (e.g. analyzing whether the presence of condition A increases the likelihood of condition B).
Common algorithms: Apriori, FP-Growth (Frequent Pattern Growth).
Use cases in Healthcare:
- Identifying frequently co-prescribed drugs to understand potential interactions;
- Discovering risk factors associated with chronic diseases like hypertension.
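The quantities behind association rules (support, confidence, and lift) can be computed directly on a small set of transactions. The sketch below covers only single items and pairs, which corresponds to the first two passes of Apriori rather than the full algorithm. The prescription records and the placeholder names `drug_a` to `drug_d` are invented for the example.

```python
from itertools import combinations

# Synthetic prescription records, one set of drugs per patient (illustrative).
PRESCRIPTIONS = [
    {"drug_a", "drug_b"},
    {"drug_a", "drug_b", "drug_c"},
    {"drug_a", "drug_c"},
    {"drug_b", "drug_d"},
    {"drug_a", "drug_b"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_pairs(transactions, min_support):
    """Pairs meeting min_support, built only from individually frequent items."""
    items = sorted({i for t in transactions for i in t})
    # Apriori pruning: a pair cannot be frequent unless both items are.
    frequent = [i for i in items if support({i}, transactions) >= min_support]
    result = {}
    for pair in combinations(frequent, 2):
        s = support(set(pair), transactions)
        if s >= min_support:
            result[pair] = s
    return result


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent)."""
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))


def lift(antecedent, consequent, transactions):
    """Above 1: the antecedent makes the consequent more likely than baseline."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))


pairs = frequent_pairs(PRESCRIPTIONS, min_support=0.4)
print("frequent pairs:", pairs)
print("confidence a -> c:", confidence({"drug_a"}, {"drug_c"}, PRESCRIPTIONS))
print("lift a -> c:", lift({"drug_a"}, {"drug_c"}, PRESCRIPTIONS))
```

Lift is the measure that answers whether condition A increases the likelihood of condition B: a value near 1 suggests the two occur independently, even if they frequently appear together.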
Neural Networks and Deep Learning
Loosely inspired by biological neural networks, these models stack layers of simple computational units to handle highly complex, high-dimensional and unstructured data (e.g. medical imaging or genomics).
Common algorithms: Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Autoencoders.
Use cases in Healthcare:
- Detection of diabetic retinopathy from retinal scans;
- Prediction of clinical outcomes using unstructured EHR data.
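The building block of a CNN is the convolution: a small filter slides over the image and responds strongly where a local pattern is present. The sketch below applies one hand-written filter followed by a ReLU activation to a tiny synthetic "image". In a trained network the filter weights are learned from data rather than fixed by hand; the image, the filter, and the function names are assumptions made for illustration.

```python
# A 4x4 synthetic grayscale "image": dark on the left, bright on the right.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A 2x2 filter that responds to a dark-to-bright vertical edge.
KERNEL = [
    [-1, 1],
    [-1, 1],
]


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


feature_map = relu(conv2d_valid(IMAGE, KERNEL))
for row in feature_map:
    print(row)
```

The output is largest in the middle column, where the edge sits. A CNN stacks many such filters and layers so that later layers can respond to progressively more complex patterns.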
Applications of Data Mining in Healthcare [5][6]
Challenges in Healthcare Data Mining
Future Directions of Data Mining in Healthcare
References
- [1] Islam MS, Hasan MM, Wang X, Germack HD, Noor-E-Alam M. A Systematic Review on Healthcare Analytics: Application and Theoretical Perspective of Data Mining. Healthcare. 2018 May 23;6(2):54.
- [2] Gordan M, Sabbagh-Yazdi SR, Ismail Z, Ghaedi K, Carroll P, McCrum D, et al. State-of-the-art review on advancements of data mining in structural health monitoring. Measurement. 2022 Apr 1;193:110939.
- [3] Birjandi SM, Khasteh SH. A survey on data mining techniques used in medicine. J Diabetes Metab Disord. 2021 Aug 31;20(2):2055–71.
- [4] Wu WT, Li YJ, Feng AZ, Li L, Huang T, Xu AD, et al. Data mining in clinical big data: the frequently used databases, steps, and methodological models. Military Medical Research. 2021 Aug 11;8:44.
- [5] Islam MS, Hasan MM, Wang X, Germack HD, Noor-E-Alam M. A Systematic Review on Healthcare Analytics: Application and Theoretical Perspective of Data Mining. Healthcare. 2018 May 23;6(2):54.
- [6] Kolling ML, Furstenau LB, Sott MK, Rabaioli B, Ulmi PH, Bragazzi NL, et al. Data Mining in Healthcare: Applying Strategic Intelligence Techniques to Depict 25 Years of Research Development. Int J Environ Res Public Health. 2021 Mar 17;18(6):3099.