Analysis of Crime Data in the Limpopo Province

South Africa has a very high rate of murder, assault, rape and other crimes compared to most countries. Most South Africans who emigrate cite crime as the major reason. Crime has become a concern for all: the police, the private security industry, real estate developers, car manufacturers, businesspeople and others. There is a high demand for crime prevention, and this calls for the continuous use of new, advanced and reliable methods to prevent crime. How bad is the level of crime in Limpopo, and what are the major crime types? This study uses secondary data from the 2011 Census conducted by Statistics South Africa to examine the composition of crime in the province and the variables that influence crime, in order to propose measures to tackle and minimize crime in the province. Multivariate statistical analysis has been employed, and the study shows that the following variables have a positive influence on the occurrence of crime: population size, number of households, youth unemployment, growth rate and dependency ratio. The study recommends slowing the population growth rate, decreasing household size and reducing youth unemployment to curb crime in the province.


Introduction
Crime is rampant in South Africa. The rates of murder, rape (adult, child and infant) and assault, for example, are very high compared to those of most countries. Most South Africans who emigrate cite crime as the biggest factor influencing their decision to leave the country. The crime rate is often considered an extremely important index for judging the welfare and quality of living within a particular area. People weigh safety when they move, when they decide to purchase real estate, and when they consider relocating for better career opportunities. Crime has many possible causes, and many studies have been conducted all over the world on how to bring down criminal activity. It is a constant endeavour of governments and policing organizations everywhere to bring down crime rates so that the world becomes a safer place to live in. The fight against crime is not new to humanity: security services have tried to bring crime down since the establishment of society. The South African Police Service manages 1115 police stations across the country in its fight against crime (Statistics South Africa, 2011). This study intends to look at the composition of crime in the Limpopo province and the variables that influence crime, in order to propose measures to minimize crime in the province. The study aims not only to give police departments a sense of which municipalities are dangerous, so that they can pay more attention to them, but also to assist local governments in finding out which variables they need to act on in order to make the province a better and safer place to live in.

Literature Review
Crime is a global crisis affecting almost all countries, but with different intensity and tempo. The causes of crime have been investigated by many social researchers, and they have been found to be broadly similar across the world, with local variations from place to place (Simpson, 1998). The current economic turmoil has led governments at all levels to reflect on their approaches to fighting crime. According to studies by the Institute for Security Studies (ISS) in 2001, crime in South Africa began to increase in the mid-1980s. Since the 1990s, and especially since 1996, the increase has been dramatic. Violent crimes increased at a greater rate than other crimes. Crime in South Africa's major cities has kept increasing since 1994. When crime rates are compared among cities, Johannesburg has the highest volume of serious crime, followed by Pretoria, Cape Town and Durban. Crime levels in all these urban centres, with the exception of Johannesburg, increased between 1994 and 1999 (ISS, 2001). Youth unemployment, poverty and the proliferation of guns are cited as major contributors to crime. However, according to a survey by the Centre for the Study of Violence and Reconciliation (CSVR) in 2006, unemployment was regarded as the greatest concern (33%), with crime in second place (30%), HIV/AIDS in third (15%) and poverty in fourth place (9%). Among people in the highest income group (household income of more than R8 000 per month (US$600)), on the other hand, crime was regarded as the highest priority by 42%, with unemployment regarded as the highest priority by 24% (CSVR, 2008, 2009).
The rate of murder is increasing. For the first time in 20 years, the number of murders and the murder rate have increased for a second consecutive year (CSVR, 2009; Statistics South Africa, 2013). The murder rate is regarded as one indicator of a country's stability: the higher it is, the less stable a country is regarded to be. [The murder rate refers to the number of people who are murdered per 100 000 of the population.]
A recent editorial in the Palm Beach Post stated that pressure is on the Criminal Justice Commission (CJC) to demonstrate that cherished programmes, such as the youth empowerment centres, are effective in reducing crime (International Centre for the Prevention of Crime (ICPC), 2010). Crime prevention implies programmes that train young people and equip them with skills, because, as the adage goes, "an idle hand causes damage". Crime prevention has been described as "any initiative or policy which reduces, avoids or eliminates victimization by crime or violence" (Homel, 2005; CSVR, 2009). It "includes governmental and nongovernmental initiatives to reduce the fear of crime as well as lessen the impact of crime on victims" (ICPC, 2010).
Crime can be prevented through social prevention, situational crime prevention and legal sanction strategies, among others. Social prevention involves neighbours forming organised watch-dog programmes. Situational crime prevention involves measures that tighten access control and surveillance, making it extremely hard for criminals to operate and making the risk of their being apprehended equally high. Legal sanctions, on the other hand, aim at incarcerating offenders with long-term sentences to serve as a deterrent (Schlossman et al., 1984; Clarke, 1977; Homel, 2005). There is clear evidence that well-planned crime prevention strategies prevent crime and victimization, promote community safety and contribute to the sustainable development of countries. Effective crime prevention enhances the quality of life of all citizens. It has long-term benefits in terms of reducing the costs associated with the formal criminal justice system, as well as other social costs that result from crime. Crime prevention offers opportunities for a humane and more cost-effective approach to the problems of crime (Clarke and Homel, 1997; UN, 2002).

Methodology
Data on 13 variables from the 2011 Census, covering population, economic, social and housing characteristics, have been used (Statistics SA, 2012). The variables are: population size (POP), population density (POPDEN), number of households (NOH), unemployment rate (UMPR), youth unemployment rate (YUMPR), growth rate (GRTR), youth (YTH), working age (WKA), elderly (ELDERLY), sex ratio (SRT), average household size (AHS), female-headed households (FHH) and dependency ratio (DPR). The dataset has 25 entries in total; each entry holds the information for one local municipality in the Limpopo province.
Sampling adequacy: Before using any statistical procedure, there is always a need to test the relevance of that procedure for the data in question. Factor Analysis (FA), or more precisely Principal Component Analysis (PCA), has been used in this study, and Table 2 tests its suitability. Table 2 shows the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy and Bartlett's test of sphericity. The KMO statistic varies between 0 and 1. A value of 0 indicates that the sum of partial correlations is large relative to the sum of correlations, which means there is diffusion in the pattern of correlations and the data are therefore likely to be inappropriate for Factor Analysis (Everitt and Dunn, 2001; Everitt and Hothorn, 2010). A value close to 1, on the other hand, indicates that the patterns of correlations are relatively compact, so Factor Analysis should yield distinct and reliable factors. Values greater than 0.5 are considered acceptable: values between 0.5 and 0.7 are mediocre, between 0.7 and 0.8 good, between 0.8 and 0.9 great, and above 0.9 superb (see Hutcheson and Sofroniou, 1999, pp. 224-225). For these data the value is 0.730, which falls into the good range, so we can be confident that FA, or more precisely PCA, is appropriate.
Verifying distributional assumptions is crucial before any statistical modelling. The multivariate normal distribution is one of the most frequently made distributional assumptions in multivariate statistical techniques such as Principal Component Analysis and Discriminant Analysis. If X=(X1, X2, …, Xp) follows the multivariate normal distribution, then its individual components X1, X2, …, Xp are all normally distributed. Normality of each Xi is therefore a necessary (though not by itself sufficient) condition for X=(X1, X2, …, Xp) to be multivariate normal, so we examine the normality of each Xi (Everitt and Hothorn, 2010; Johnson and Wichern, 2014).
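The KMO measure and Bartlett's statistic discussed under sampling adequacy can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the software used in the study (which the text does not name): it assumes NumPy is available, the function name `kmo_and_bartlett` is hypothetical, and the 25 x 4 data matrix is synthetic rather than the census data.

```python
import numpy as np


def kmo_and_bartlett(data):
    """Return (overall KMO, Bartlett chi-square statistic, degrees of freedom)."""
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    corr = np.corrcoef(x, rowvar=False)          # p x p correlation matrix
    inv = np.linalg.inv(corr)

    # Partial correlations are obtained from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale

    off_diag = ~np.eye(p, dtype=bool)            # mask selecting off-diagonal entries
    sum_r2 = np.sum(corr[off_diag] ** 2)         # squared correlations
    sum_a2 = np.sum(partial[off_diag] ** 2)      # squared partial correlations
    kmo = sum_r2 / (sum_r2 + sum_a2)

    # Bartlett's test of sphericity: H0 says the correlation matrix is the identity.
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return kmo, chi2, dof


# Synthetic example (NOT the study data): four variables sharing one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(25, 1))
demo = latent + 0.5 * rng.normal(size=(25, 4))

kmo, chi2, dof = kmo_and_bartlett(demo)
print(f"KMO = {kmo:.3f}, Bartlett chi-square = {chi2:.2f} on {dof} df")
```

A KMO value computed this way would be read against the thresholds quoted above; the p-value for Bartlett's statistic would come from a chi-square distribution with `dof` degrees of freedom.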

Table 3: New variables after transformation
Here, the quantile-quantile plot (QQ plot) has been used to assess the normality of the data. In a QQ plot, the standardized values of a variable are compared against the quantiles of the standard normal distribution. The correlation between the sample data and the normal quantiles measures how well the data are modelled by a normal distribution. For normal data, the plotted points should fall approximately on a straight line. If they do not, a transformation is applied to bring the data closer to normality. Available transformations include the logarithm, square root, power transformation and/or scale function (Bartholomew et al., 2008). The QQ plot of each variable revealed that none of the variables follows a normal distribution. We therefore tried different transformations on all the variables to obtain substitute variables that perform better on normality.
Figure 2 shows the multicollinearity between the variables. We therefore introduce Principal Component Analysis to rotate the variable matrix X=(X1, X2, …, Xp) to achieve orthogonality, so that patterns can be deciphered more easily, and at the same time to reduce the dimension of the data for simpler processing (Everitt and Hothorn, 2010). With the assistance of PCA, we obtain a clear pattern of our municipal profile data without much loss of information. The following tables and figures give the results of the principal component analysis. Only two factors were retained: the first can be called ratio, because it consists of rates or percentages, and the second is called size. Table 4 shows that factor 1 explains 66.9% of the variance while factor 2 explains 18.6%, so the two factors together account for over 85% of the variation. Figure 3 shows the scree plot, which supports retaining only two factors.
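The QQ-plot normality check and the transformation step described in this section can be sketched as follows. This is an illustration only: the function names are hypothetical, the plotting positions (i - 0.5)/n are one common convention chosen here as an assumption, and the sample values are synthetic (constructed to be log-normal), not the census variables.

```python
import math
from statistics import NormalDist, mean, stdev


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def qq_correlation(values):
    """Correlation between standardized sorted data and standard normal quantiles."""
    xs = sorted(values)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    standardized = [(x - m) / s for x in xs]
    # Assumed plotting positions: (i - 0.5) / n for i = 1..n.
    normal_q = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return pearson(standardized, normal_q)


# Synthetic right-skewed sample of 25 values (NOT the study data).
raw = [math.exp(NormalDist().inv_cdf((i - 0.5) / 25)) for i in range(1, 26)]
logged = [math.log(v) for v in raw]  # logarithmic transformation

r_raw = qq_correlation(raw)
r_log = qq_correlation(logged)
print(f"QQ correlation before transform: {r_raw:.4f}, after log transform: {r_log:.4f}")
```

A correlation closer to 1 after transformation indicates that the transformed variable lies closer to the straight line expected in a QQ plot of normal data.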

Figure 3: Scree plot
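The eigenvalues behind a scree plot, the proportion of variance explained by each component, and Kaiser's eigenvalue-greater-than-1 rule can be sketched as follows. This is a minimal sketch assuming NumPy; it performs an unrotated PCA on the correlation matrix, which may differ from the extraction and rotation settings used in the study, and both the function name `pca_from_correlation` and the data are hypothetical and synthetic.

```python
import numpy as np


def pca_from_correlation(data):
    """Unrotated PCA on the correlation matrix of `data` (rows = observations)."""
    x = np.asarray(data, dtype=float)
    corr = np.corrcoef(x, rowvar=False)

    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # re-order to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    explained = eigvals / eigvals.sum()          # proportion of variance per PC
    keep = int(np.sum(eigvals > 1.0))            # Kaiser's criterion

    # Loadings: correlations between the original variables and the PCs.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))

    # Component scores of the standardized data on the retained PCs.
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    scores = z @ eigvecs[:, :keep]
    return eigvals, explained, keep, loadings, scores


# Synthetic example (NOT the study data): 25 rows, 6 variables, 2 latent factors.
rng = np.random.default_rng(1)
factors = rng.normal(size=(25, 2))
weights = rng.normal(size=(2, 6))
demo = factors @ weights + 0.3 * rng.normal(size=(25, 6))

eigvals, explained, keep, loadings, scores = pca_from_correlation(demo)
print("eigenvalues:", np.round(eigvals, 3))
print("cumulative variance explained:", np.round(np.cumsum(explained), 3))
print("components retained by Kaiser's criterion:", keep)
```

Plotting `eigvals` against the component number gives the scree plot; the cumulative sum of `explained` corresponds to the cumulative percentage of variance reported in a table such as Table 4.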
Earlier on, we mentioned three methods which can be used to determine the number of factors (PCs) to retain; the one applied above is Kaiser's criterion, which says we should retain only the factors (PCs) with eigenvalues greater than 1. Figure 3 is a scree plot: the largest change in the slope occurs at PC3, which is the "elbow" of the plot, so we should retain the first two PCs (Everitt and Hothorn, 2010; Johnson and Wichern, 2014). Table 5 shows that the variables that load highly onto factor1 are WKA%, DPR, YTH%, FHH, AHS, UMPR%, YUMPR%, SEX RATIO, ELDERLY% and GRTR%. We call factor1 ratio. The variables that load highly onto factor2 are NOH, POP and POPDEN. We call factor2 size. Figure 4 shows that GRTR%, WKA% and SEX RATIO are negatively correlated with the rest of the variables that load onto factor1 (ratio), with WKA% and SEX RATIO also negatively correlated with GRTR%. Table 6 shows the extraction method from PCA. Discriminant Analysis (DA) on Principal Components: Since our PCs capture the majority of the information in the original data, and the number of PCs is much smaller than the number of original variables, it is natural to use the factors computed from PCA as input factors for the DA algorithm. The new factors, which are linear combinations of the original variables, have the following advantageous properties:

Interpretation:
• Their interpretation allows us to detect patterns in the initial data space.
• They reduce a large number of variables to a smaller number of factors; hence we can remove noise from the dataset by using only the most relevant factors.
• Algorithms such as Linear Discriminant Analysis (LDA) may behave better because the PCs are mutually uncorrelated.
• Principal Component Analysis was successfully applied to our dataset, extracting two PCs from the 13 original variables, which is a substantial dimensional reduction. In addition, these PCs account for 85.497% of the variance of the original dataset, so little information is lost. As mentioned earlier, Factor1 consists of WKA%, DPR, YTH%, FHH, AHS, UMPR%, YUMPR%, SEX RATIO, ELDERLY% and GRTR%, which are highly correlated with each other; we call Factor1 the ratio. The Component Plot shows that GRTR%, WKA% and SEX RATIO are negatively correlated with the rest of the variables in Factor1, with WKA% and SEX RATIO also negatively correlated with GRTR%.
• Factor2 consists of NOH, POP and POPDEN; we call Factor2 the size. These two factors can be used to classify a municipality as safe or unsafe.
• Discriminant Analysis is applied to classify a local municipality by the composition of crime even if there are no data records of crime. In our case, we applied Linear Discriminant Analysis to the original variables and to the Principal Components respectively (Bian, 2005).
• We used the unstandardized coefficients to construct the actual prediction equation, which can be used to classify new cases (Johnson & Wichern, 2007). Our analysis gives the following model:
discriminant score = -1.877 + 2.981 factor1 - 0.012 factor2 + ε
From the model we can see that, to minimize the discriminant score, we have to increase factor2 (size) and decrease factor1 (ratio).
Discussion: The foregoing analysis shows that population size influences crime. Population increase has to be considered seriously because it has a knock-on effect on society with negative consequences. It leads to more people with some form of frustration or resentment towards society, such that they end up engaging in criminal activities (Statistics SA, 2011). Increased population leads to congestion (excess population), competition and jealousy, and this is one of the biggest causes of crime and of many of the challenges that the world faces. Reducing the number of households and the population density is seen as important for fighting crime and bringing it down. Sex ratio is seen to be negatively related to the elderly share because, as people advance in age, the men usually die earlier than the women (their wives), which lowers the sex ratio. The questions, again, are: what factors may be related to safety? Are there variables that we can observe to predict the safety and security of communities? What strategies are needed to reduce crime rates? These are some of the questions that have prompted this study.
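The discriminant model reported in the Interpretation section can be evaluated for a new case as sketched below. The coefficients are those given in the text; everything else is an assumption made for illustration: the function name `discriminant_score` is hypothetical, the error term is omitted because it is unobserved for a new case, and the factor values are made-up inputs. No classification cut-off is applied, since the text does not report one.

```python
def discriminant_score(factor1, factor2):
    """Score from the reported unstandardized coefficients (error term omitted).

    factor1 is the 'ratio' component score, factor2 the 'size' component score.
    """
    return -1.877 + 2.981 * factor1 - 0.012 * factor2


# Hypothetical component scores for three municipalities (NOT from the study).
examples = {
    "municipality A": (0.0, 0.0),
    "municipality B": (1.0, 0.0),
    "municipality C": (0.0, 1.0),
}

for name, (f1, f2) in examples.items():
    print(f"{name}: score = {discriminant_score(f1, f2):.3f}")
```

The sketch makes the direction of each coefficient visible: a unit increase in factor1 raises the score by 2.981, while a unit increase in factor2 lowers it by 0.012.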

Conclusion
The study shows that several variables have an influence on crime. Notably, population size, population density, number of households, unemployment rate, youth unemployment rate, growth rate and dependency ratio have a positive influence on the prevalence of crime. These variables are all related to the population size of a particular place. Crime increases when population increases, and local authorities should take note of that. Excess population results in high youth unemployment in a shrinking economy like South Africa's. To control crime in the Limpopo Province in particular, and in the country as a whole, local, provincial and national governments therefore have an obligation to create job opportunities that reduce unemployment, especially among the youth, and to provide educational facilities that equip young people with skills so that they can find employment or occupy themselves with a trade. The government can also fight crime by increasing the severity of punishment for offenders, and by providing educational centres where people are taught about crime and how they can protect themselves from being victimized.
Limitation of the study: The variables chosen are based on previous studies of possible factors causing crime, as well as on the availability and accessibility of the variables. If data resources were sufficient, additional variables such as alcohol and drug consumption, educational factors and other factors could have been considered in this analysis.