Data mining application to healthcare fraud detection: a two-step unsupervised clustering method for outlier detection with administrative databases
As the recipient of large public and private investments, the healthcare sector is an attractive target for fraudsters. The large amount of data now available makes it possible to tackle this problem with data mining techniques, which can make controls more efficient in cost and time than manual audits.
This research aims to develop a novel data mining model for detecting fraud among hospitals. In particular, it focuses on DRG (Diagnosis-Related Group) upcoding, i.e. the tendency to record, in the Hospital Discharge Charts (HDC) stored in administrative databases, codes for the services provided and for inpatients' health status such that the hospitalization falls into a more remunerative DRG class.
The proposed model consists of two steps. The first step clusters providers according to their characteristics and their behavior in treating a specific disease, in order to spot outliers within these groups of peers. The second step performs a cross-validation, which lets controllers verify whether any hospital on the list of suspects identified in the first step may be justified in its outlierness by its particular characteristics or by the treatment of a more complex patient base.
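The two-step logic can be illustrated with a minimal Python sketch. It is not the model developed in this research: the peer groups are assumed to be already assigned (the clustering itself is not shown), a median/MAD modified z-score with a cutoff of 3.5 is swapped in as the outlier rule, and the field names (`peer_group`, `high_drg_share`, `complexity`), the hospital identifiers and all figures are hypothetical toy values.

```python
from statistics import median


def robust_z(values):
    """Modified z-scores based on the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0] * len(values)
    return [0.6745 * (v - med) / mad for v in values]


def high_outliers(hospitals, field, threshold=3.5):
    """Return ids of hospitals whose `field` is unusually high within their peer group."""
    groups = {}
    for h in hospitals:
        groups.setdefault(h["peer_group"], []).append(h)
    flagged = set()
    for members in groups.values():
        scores = robust_z([m[field] for m in members])
        flagged.update(m["id"] for m, z in zip(members, scores) if z > threshold)
    return flagged


def two_step_screen(hospitals):
    """Step 1: flag suspects on coding behavior. Step 2: check them against patient complexity."""
    suspects = high_outliers(hospitals, "high_drg_share")
    complex_mix = high_outliers(hospitals, "complexity")
    return {
        hid: "possibly justified" if hid in complex_mix else "review"
        for hid in sorted(suspects)
    }


# Hypothetical toy data: (id, peer group, share of discharges in the
# better-paid DRG, patient complexity index). Values are invented.
rows = [
    ("H01", "A", 0.20, 1.0), ("H02", "A", 0.22, 1.1), ("H03", "A", 0.21, 0.9),
    ("H04", "A", 0.19, 1.0), ("H05", "A", 0.23, 1.2), ("H06", "A", 0.45, 1.1),
    ("H07", "B", 0.30, 1.5), ("H08", "B", 0.32, 1.6), ("H09", "B", 0.31, 1.4),
    ("H10", "B", 0.29, 1.5), ("H11", "B", 0.60, 2.8),
]
hospitals = [
    {"id": i, "peer_group": g, "high_drg_share": s, "complexity": c}
    for i, g, s, c in rows
]

result = two_step_screen(hospitals)
print(result)
```

In this toy data both H06 and H11 stand out from their peers on coding behavior, but only H11 also treats a markedly more complex patient base, so it is the one marked as possibly justified while H06 stays on the list for review.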
The proposed model was tested on a database of HDC collected by Regione Lombardia (Italy) over three years (2013-2015), focusing on the treatment of heart failure.