Abstract
Accounting fraud is the intentional material misstatement of financial statements or disclosures (in notes to financial statements or SEC filings), or the commission of an illegal act that has a direct material effect on financial statements or financial disclosures. Accounting fraud reduces public confidence in capital markets and impedes economic development. Researchers are working to develop models that detect accounting fraud from the narrative descriptions in Form 10-K and other reports, on the premise that identifying the linguistic patterns that mask accounting fraud makes it possible to build a model with high detection accuracy.
This study examines the linguistic characteristics that appear in the MD&A section of annual securities reports when accounting fraud is committed, constructs a fraud-detection model, and evaluates its effectiveness. Based on interpersonal deception theory, we hypothesize that during accounting fraud periods, managers avoid describing their company's business conditions in detail and use more general language, thus reducing word specificity.
We measured vocabulary specificity using Inverse Document Frequency (IDF), which quantifies how rare a word is across documents. IDF was lower during accounting fraud periods than during non-fraud periods, indicating a significant decrease in vocabulary specificity. Based on these results, we constructed an accounting fraud detection model using IDF-weighted features and found that detection accuracy improved.
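As a rough illustration of the IDF-based specificity measure described above, the following minimal sketch assumes the textbook definition idf(w) = log(N / df(w)) and scores a document by the mean IDF of its tokens. The abstract does not state the exact IDF variant or how scores are aggregated, so the function names `idf_table` and `mean_idf`, the averaging step, and the toy sentences are all hypothetical.

```python
import math
from collections import Counter

def idf_table(docs):
    """Map each word to log(N / df), where df counts documents containing it.

    Assumption: plain unsmoothed IDF; the study may use a different variant.
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each word at most once per document
    return {word: math.log(n_docs / count) for word, count in df.items()}

def mean_idf(tokens, idf):
    """Average IDF over a document's tokens; higher means rarer vocabulary."""
    scores = [idf[t] for t in tokens if t in idf]
    return sum(scores) / len(scores) if scores else 0.0

# Toy, made-up sentences standing in for MD&A text (not real filings).
corpus = [
    "sales of the turbine segment fell in ohio".split(),
    "results were affected by general market conditions".split(),
    "general conditions affected results".split(),
]
idf = idf_table(corpus)
specific = mean_idf(corpus[0], idf)  # detailed wording, words unique to one document
generic = mean_idf(corpus[2], idf)   # general wording, words shared across documents
print(specific > generic)
```

In this toy corpus the detailed sentence receives a higher mean IDF than the generic one, which is the direction of the effect the hypothesis predicts for non-fraud versus fraud periods.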