Detection of electricity theft in developing countries-A machine learning approach

Leonardo Grant*
Haniph Latchman
Kolapo Alli

Abstract

In developing countries, energy theft hampers the growth of utilities through lost revenue and damage to the grid. Countering theft requires extracting meaningful features from large and varied utility data sets, which is difficult and computationally expensive. Recent developments have made machine learning more accessible to researchers, enabling its application to big data analysis for power utilities. Greater access to training resources, together with commercial and open-source machine learning tools, has made it easier to test large data sets against various algorithms and to automate many of the steps involved, such as data cleaning and feature extraction, a procedure known as Automated Machine Learning (AutoML). These tools, along with frequent data collection by utilities, make machine learning well suited to power grid problems such as anomaly detection. This paper focuses on feature extraction from monthly consumption records, previous investigations, and other customer information to detect the power anomalies that are critical to detecting theft. Features were extracted using AutoML, and models were then trained and tested on data gathered from investigations. The results show that machine-learning algorithms can make anomaly detection four times more effective than present manual detection techniques, raising its effectiveness from 10% to 40% while reducing the number of unnecessary audit investigations by 61%.
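As a rough illustration of what feature extraction from monthly consumption records can look like, the following minimal Python sketch computes a few hand-picked summary features for one customer. The feature names, the helper functions `consumption_features` and `hit_rate`, and the sample readings are all hypothetical; they are not the AutoML-generated features or the data used in the paper. The final lines assume that the 10% and 40% figures can be read as the share of inspections that confirm an anomaly, which is an interpretation rather than something the abstract states.

```python
from statistics import mean, pstdev


def consumption_features(monthly_kwh):
    """Summarise one customer's monthly consumption (kWh) as a feature dict.

    Illustrative features only; not the feature set used by the authors.
    """
    if len(monthly_kwh) < 2:
        raise ValueError("need at least two monthly readings")
    avg = mean(monthly_kwh)
    # Coefficient of variation: spread relative to the customer's own level.
    cv = pstdev(monthly_kwh) / avg if avg > 0 else 0.0
    # Relative month-over-month drops; a sudden fall may indicate an anomaly.
    drops = [
        (prev - cur) / prev
        for prev, cur in zip(monthly_kwh, monthly_kwh[1:])
        if prev > 0
    ]
    # Clamp at zero so steadily rising consumption reports no drop.
    max_drop = max(max(drops, default=0.0), 0.0)
    return {"mean_kwh": avg, "cv": cv, "max_rel_drop": max_drop}


def hit_rate(inspection_outcomes):
    """Fraction of inspections that confirmed an anomaly (1 = confirmed)."""
    return sum(inspection_outcomes) / len(inspection_outcomes)


# Hypothetical readings: one steady customer, one with a sudden fall.
steady = [300, 310, 295, 305, 300, 298]
dropped = [300, 310, 295, 150, 140, 145]
steady_features = consumption_features(steady)
dropped_features = consumption_features(dropped)

# Assumed reading of the reported figures: 1 in 10 versus 4 in 10
# inspections confirming an anomaly, i.e. a fourfold improvement.
manual = hit_rate([1, 0, 0, 0, 0, 0, 0, 0, 0, 0])
model = hit_rate([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
uplift = model / manual
```

In practice, features like these would be computed for every customer and passed, together with labels from past investigations, to a classifier; AutoML tools automate the search over many such candidate features.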

Article Details

Grant, L., Latchman, H., & Alli, K. (2023). Detection of electricity theft in developing countries-A machine learning approach. Trends in Computer Science and Information Technology, 8(2), 038–049. https://doi.org/10.17352/tcsit.000067
Research Articles

Copyright (c) 2023 Grant L, et al.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
