Machine Learning-Based Business Rule Engine Data Transformation over High-Speed Networks
Raw data processing is a key business operation. Business-specific rules determine how raw data should be transformed into the formats the business requires. When source data changes format continuously and contains keying errors and invalid values, transforming it effectively becomes a major challenge. Conventional extraction and transformation techniques handle such data with delay, because fluctuating data formats demand continuous redevelopment of the business rule engine. An effective business rule engine therefore needs near real-time detection of the applicable business rule, together with data transformation mechanisms that use machine learning classification models. Because data is combined from numerous sources and legacy systems, it is difficult to categorize and cluster the data and to apply the business rules that turn raw data into the business-required format. This paper proposes a methodology for designing ensemble machine learning techniques that classify and segment the registered numbers of registered title records, in order to choose the most suitable business rule for converting each registered number into the format the business expects, allowing businesses to provide customers with the most recent data in less time. The study evaluates the proposed model by gathering sample data and analyzing classification machine learning (ML) models to determine the relevant business rule. The experiments used Python, R, SQL stored procedures, Impala scripts, and Datameer tools.
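As an illustration only, the rule-selection idea in the abstract (classify a registered number, then apply the business rule chosen for its class) can be sketched in Python. This is a minimal sketch and not the model evaluated in the paper: the rule names, the training samples, the feature functions, and the simple lookup classifiers combined by majority vote are all hypothetical stand-ins for the ensemble ML models the study analyzes.

```python
from collections import Counter, defaultdict

# Hypothetical business rules (assumed for illustration, not from the paper).
# Each rule converts a raw registered number into a target format.
RULES = {
    "strip_separators": lambda s: "".join(c for c in s if c.isalnum()).upper(),
    "zero_pad_numeric": lambda s: s.strip().zfill(8),
    "uppercase_only": lambda s: s.strip().upper(),
}


def shape(value):
    """Collapse a string to its character-class shape, e.g. 'ab-12' -> 'AA-99'."""
    return "".join(
        "9" if c.isdigit() else "A" if c.isalpha() else c for c in value
    )


def separators_and_letters(value):
    """Coarse feature: (contains a non-alphanumeric char, contains a letter)."""
    return (
        any(not c.isalnum() for c in value),
        any(c.isalpha() for c in value),
    )


def is_numeric(value):
    """Coarse feature: True when the value consists of digits only."""
    return value.isdigit()


class LookupClassifier:
    """Weak learner: predicts the most frequent training label per feature value."""

    def __init__(self, feature):
        self.feature = feature
        self.table = defaultdict(Counter)
        self.default = None

    def fit(self, samples, labels):
        for sample, label in zip(samples, labels):
            self.table[self.feature(sample)][label] += 1
        # Fall back to the overall most frequent label for unseen feature values.
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, sample):
        counts = self.table.get(self.feature(sample))
        return counts.most_common(1)[0][0] if counts else self.default


def ensemble_predict(models, sample):
    """Majority vote over the rule labels predicted by the individual models."""
    votes = Counter(model.predict(sample) for model in models)
    return votes.most_common(1)[0][0]


def transform(models, raw):
    """Select a rule with the ensemble, then apply it to the raw value."""
    rule = ensemble_predict(models, raw)
    return rule, RULES[rule](raw)


# Synthetic training data (made up for this sketch).
SAMPLES = ["ab-12/34", "cd 56-78", "1234", "987",
           "xy1234", "pq9876", "ef-90/12", "55"]
LABELS = ["strip_separators", "strip_separators", "zero_pad_numeric",
          "zero_pad_numeric", "uppercase_only", "uppercase_only",
          "strip_separators", "zero_pad_numeric"]

models = [
    LookupClassifier(feature).fit(SAMPLES, LABELS)
    for feature in (shape, separators_and_letters, is_numeric)
]

if __name__ == "__main__":
    for raw in ("gh-11/22", "42", "mn4321"):
        print(raw, "->", transform(models, raw))
```

In this sketch the three weak learners can disagree on an input such as "mn4321", and the majority vote resolves the disagreement. In practice the lookup classifiers would presumably be replaced by trained classification models, with the rule table maintained by the business.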
Keywords: CRISP-DM, data mining algorithms, business rules, prediction, classification, machine learning, deep learning, AI design, method