Benford’s law does not come up often in research; it is mostly applied in research in accounting, banking and finance departments. It came into use in these fields as a means of detecting accounting fraud in the accounting sector and bank fraud in the banking sector. Many undergraduate students make the mistake of using the methodology for a quantitative or qualitative survey as the methodology for applying Benford’s law. The methodology for applying Benford’s law is quite different from both, although it borrows a few elements, such as the research design, from quantitative survey methodology. I am writing this article in the hope that it helps final year students with their research work. The article does not discuss how to use Benford’s law to detect fraud; it gives the methodology for applying Benford’s law in detecting fraud. Below is the format of that methodology:
RESEARCH METHODOLOGY
3.0 PREAMBLE
This chapter describes the methods applied to achieve the research objectives. Benford’s law can be used as a test of the honesty or validity of purportedly random scientific data in a social science context, but accountants did not pick it up until the late 1980s. At that time, two studies relied on digital analysis to detect earnings manipulation. Carslaw (1988) found that earnings numbers from New Zealand firms did not conform to the expected distribution.
3.1 RESEARCH DESIGN
This research work employs Benford’s law to detect accounting fraud in the financial sector. Benford’s law was chosen because of its efficiency and its strength relative to the role of auditors. The research design for this study is exploratory; exploratory research is conducted when a problem has not been clearly defined, and it helps to determine the best research design, data collection method and selection of subjects.
Primary methods, namely literature research, consultation with experts in the area of study and interviews, were also employed.
3.2 DATA COLLECTION METHOD
This study uses secondary data consisting of the financial statements of selected companies over a period of two years, so that last year’s results can be compared with those of the present year. The fraud data for the selected companies in the financial sector will be collected from the CBN Statistical Bulletin 2015.
3.3 DATA ANALYSIS
Careful data analysis is one of the best ways of getting accurate information to support decision making in a research work. Saunders et al (2000) define data analysis as consisting of three concurrent flows of activity: data reduction, data display, and conclusion drawing/verification.
Various analytical tools and software, such as pie charts, tables and Excel, will be used in analyzing the data for this study.
Data collected will be analyzed using frequencies and percentages. These frequencies and percentages will enable the researcher to represent the characteristics of the data and the findings clearly and accurately. Any unusual trends will be highlighted so that the anomalies can be investigated. A chi-square test will be carried out on the data set to test the “goodness of fit” of the data. The items in the tables used for this study will also be described and interpreted.
3.3.1 The Chi-Square Test
For each category (here, each lead digit), take the observed and expected frequencies, compute the difference between them [O – E], square it [(O – E)²], divide the square by the expected frequency [(O – E)² / E], and sum those quantities to give χ²:
χ² = Σ [(O – E)² / E]
Where: χ² is the value for chi-square,
Σ is the sum over all categories,
O is the observed frequency,
E is the expected frequency.
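As a minimal sketch of the calculation just described, the chi-square statistic can be computed in a few lines of Python. The function name chi_square_statistic and the three-category counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from this study.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustrative only, not real data).
observed = [50, 30, 20]
expected = [40, 40, 20]

# (50-40)^2/40 + (30-40)^2/40 + (20-20)^2/20 = 2.5 + 2.5 + 0 = 5.0
print(chi_square_statistic(observed, expected))
```

The larger the statistic, the further the observed frequencies are from the expected ones.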
3.3.2 TESTING LEAD DIGITS AGAINST BENFORD’S LAW IN EXCEL
Step 1: Select the Sample Data
The first task is to obtain sample test data and store them in an Excel spreadsheet; the more observations included, the better.
Step 2: Parse the Lead Digit
Benford’s Law focuses on the lead digit in sets of naturally occurring numbers. The actual magnitude of the data (i.e., whether an amount is N 10, N 100 or N 1,000) is unimportant. In a spreadsheet, one can select or “parse” the lead digit for each amount using Excel’s LEFT function. The general form of this formula is:
=LEFT(Data item, Number of characters)
Here, the term “Data item” is a cell reference and “Number of characters” indicates how many characters to parse (starting from the left side of the name or number). To parse the lead digit, the number of characters is 1.
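For readers who want to check the parsing step outside Excel, the following Python sketch extracts the lead digit of each amount. The helper name lead_digit and the sample amounts are hypothetical. Unlike a plain leftmost-character rule, this sketch skips a minus sign, a decimal point and leading zeros, on the assumption that an amount such as 0.042 should count as a lead digit of 4.

```python
def lead_digit(amount):
    """Return the first non-zero digit of a number, or None if there is none."""
    for ch in str(abs(amount)):
        if ch in "123456789":
            return int(ch)
    return None  # the amount is zero, so it has no lead digit


# Hypothetical amounts (illustrative only).
amounts = [1250.75, 10, 100, 1000, -87, 0.042]
print([lead_digit(a) for a in amounts])  # [1, 1, 1, 1, 8, 4]
```

The amounts 10, 100 and 1,000 all give a lead digit of 1, which illustrates why the magnitude of the data does not matter.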
Step 3: Create a Frequency Distribution
The next step is to create a frequency distribution of the lead digits that have been parsed from the sample data. To do this, create a table whose first column, under the heading “Digit”, holds the numbers “1”, “2”, …, “9”, and record next to each digit how many times it occurs as a lead digit in the sample.
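The same frequency table can be sketched in Python with the standard library’s Counter class. The function name lead_digit_counts and the short list of lead digits are hypothetical; a real sample would contain many more observations.

```python
from collections import Counter


def lead_digit_counts(lead_digits):
    """Return the number of occurrences of each digit 1..9, in that order."""
    counts = Counter(lead_digits)
    # Digits that never occur still get a row, with a count of zero.
    return [counts.get(d, 0) for d in range(1, 10)]


# Hypothetical lead digits already parsed from a sample (illustrative only).
sample = [1, 1, 2, 3, 1, 9, 2, 1, 4]
for digit, count in enumerate(lead_digit_counts(sample), start=1):
    print(digit, count)
```

Keeping a row for every digit, including those with a count of zero, makes the table line up with the nine expected values used later.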
Step 4: Compute the Expected Distribution
Benford’s Law predicts that approximately 30.1% of lead digits will be a 1, 17.6% of the lead digits will be a 2, and so forth.
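These percentages come from the standard Benford formula, in which the expected proportion of lead digit d is log10(1 + 1/d). The sketch below computes all nine proportions and converts them to expected counts for a sample; the function names and the sample size of 1,000 are assumptions made for illustration.

```python
import math


def benford_proportions():
    """Expected proportion of each lead digit 1..9 under Benford's law."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def benford_expected_counts(n_observations):
    """Expected number of observations with each lead digit in a sample."""
    return [p * n_observations for p in benford_proportions()]


percentages = [round(p * 100, 1) for p in benford_proportions()]
print(percentages)  # [30.1, 17.6, 12.5, 9.7, 7.9, 6.7, 5.8, 5.1, 4.6]

# Hypothetical sample size (illustrative only).
print([round(c, 1) for c in benford_expected_counts(1000)])
```

The expected counts, rather than the percentages, are what should be compared with the observed counts in a chi-square test.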
Step 5: Plot the Results
Now there are two sets of values: the actual distribution of the lead digits from the sample and the theoretical distribution of such digits as dictated by Benford’s Law. What one wants to know is how well these distributions match. One way to answer this question is to plot the two sets of data and observe the results. To perform this task, one can use Excel’s charting tools to create a bar graph, like the one in the insert, that shows the actual and expected distributions side by side.
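As a rough stand-in for the Excel bar graph, the comparison can also be sketched as a plain-text chart in Python. This is only an illustration: the function name text_bar_chart, the bar width and the observed counts are hypothetical, and a charting tool would give a cleaner picture.

```python
import math


def text_bar_chart(observed_counts, width=50):
    """Render actual vs. Benford-expected lead-digit shares as text bars."""
    total = sum(observed_counts)
    lines = []
    for d, count in enumerate(observed_counts, start=1):
        actual = count / total
        expected = math.log10(1 + 1 / d)
        lines.append(f"{d} actual   {'#' * round(actual * width)} {actual:.1%}")
        lines.append(f"  expected {'=' * round(expected * width)} {expected:.1%}")
    return "\n".join(lines)


# Hypothetical observed counts for digits 1..9 (illustrative only).
print(text_bar_chart([30, 18, 12, 10, 8, 7, 6, 5, 4]))
```

Digits whose two bars differ noticeably in length are the ones worth a closer look.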
Step 6: Perform a Chi-square Test
When the sample data do not graphically match the expected values very well, the question remains: “how far off are they?” To answer this question statistically, auditors can use Excel’s CHITEST function, which performs a chi-square test, to provide some guidance. The chi-square test is a “goodness-of-fit” test, i.e., a statistical test that measures how well the data distribution from a sample matches a hypothetical distribution dictated by theory.
Excel’s CHITEST function has the general form:
=CHITEST(Data range of actual values, Data range of expected values)
In this formula, the Data range of actual values reflects the values derived from a sample, while the Data range of expected values shows the expected values dictated by the theoretical distribution.
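CHITEST returns a probability (a p-value) rather than the chi-square statistic itself, and newer versions of Excel provide the same test under the name CHISQ.TEST. Both ranges should hold counts on the same scale, so the Benford percentages need to be multiplied by the number of observations first. The Python sketch below mirrors that calculation under stated assumptions: it uses eight degrees of freedom (nine digits minus one), which is my understanding of what CHITEST uses for a single column of nine cells, and it relies on the closed-form upper tail that the chi-square distribution has for an even number of degrees of freedom. The function names and the observed counts are hypothetical.

```python
import math


def chi_square_p_value_df8(statistic):
    """Upper-tail probability of a chi-square variable with 8 degrees of freedom."""
    half = statistic / 2
    # For even degrees of freedom 2k the tail is exp(-x/2) * sum_{i<k} (x/2)^i / i!; here k = 4.
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(4))


def benford_chi_square_test(observed_counts):
    """Return (chi-square statistic, p-value) for nine lead-digit counts."""
    if len(observed_counts) != 9:
        raise ValueError("expected nine counts, one per lead digit 1-9")
    n = sum(observed_counts)
    # Expected counts: Benford proportion times the sample size.
    expected_counts = [n * math.log10(1 + 1 / d) for d in range(1, 10)]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts))
    return statistic, chi_square_p_value_df8(statistic)


# Hypothetical observed counts for digits 1..9 (illustrative only).
observed = [301, 176, 125, 97, 79, 67, 58, 51, 46]
statistic, p_value = benford_chi_square_test(observed)
print(f"chi-square = {statistic:.4f}, p-value = {p_value:.4f}")
```

A small p-value (conventionally below 0.05) suggests that the lead digits depart from Benford’s Law by more than chance would explain; it flags the data for closer examination and is not, on its own, proof of fraud.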