Australian Open Data Criminology Project


Different types of crime have different causes and different factors influencing their incidence. Traffic crimes tend to revolve around proactive actions by police (breathalysing, setting up speed cameras) and the behaviour of the public (a seasonal influx of visitors through an area, for example).

Policy choices affect crime rates. Regardless of the category of offence, not all crimes are reported to, or discovered by, police. If there is a crackdown on a particular type of crime then the recorded rates will rise simply because more offences are discovered, not because more people are committing that crime. An interesting example is 'Transport Regulatory Offences' for the Sydney LGA. It is interesting to speculate on the policy decision that might account for the sharp rise in this offence type.
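The crackdown effect can be shown with a small arithmetic sketch. The numbers below are invented purely for illustration and do not come from any dataset, and `recorded_offences` is a hypothetical helper, not part of this project.

```python
def recorded_offences(actual_offences: int, detection_share: float) -> int:
    """Offences that reach police records, given the share detected or reported."""
    if not 0.0 <= detection_share <= 1.0:
        raise ValueError("detection_share must be between 0 and 1")
    return round(actual_offences * detection_share)


# Invented numbers: underlying offending is held constant at 1,000 offences,
# and only the share that police detect changes.
before_crackdown = recorded_offences(1000, 0.10)
during_crackdown = recorded_offences(1000, 0.30)

print(before_crackdown, during_crackdown)
```

In this sketch the recorded figure triples although the number of offences actually committed has not changed at all.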

This project began as a way to inform my submission to the Senate Inquiry into the Cashless Debit Card Bill 2017. The Cashless Debit Card is a means of delivering 'blanket' income management to people in receipt of working age income support payments (anyone on a 'welfare' payment who is not on the Age Pension). The term blanket is used to differentiate the scheme from targeted income management, where individuals are selected for intervention due to demonstrated issues with managing their finances.

The Cashless Debit Card (CDC) Bill aims to expand income management to all persons on welfare regardless of their capacity to manage their affairs or the needs of the individual or family.

In order to 'manage' income, the CDC forces families to have separate cards and accounts even where that is not appropriate (for example where a person is caring for an invalid spouse or relative), and shares all purchasing data with the government for purposes of surveillance.

These are significant breaches of the human right to a private life and family. To avoid breaching the human rights treaties Australia is a signatory to, the government has to justify the removal of human rights from a large section of the population in the Explanatory Memorandum of the Bill. More on this.

The argument the government has made to justify the blanket removal of human rights is to quote crime statistics for remote Indigenous communities, and claim the benefits of the CDC are so powerfully positive that they override the removal of rights - hence the interest in crime data.

In my submission I argued that this justification only works (if at all) in areas where there are extreme crime rates and significant social harm flowing from these. That is, it cannot be used to argue for the removal of rights in areas where the issues justifying it do not exist in the same extreme way.

Given that crime data is often used in social policy, I decided to invest some time in establishing a public facing open data project using crime data.

The project provides transparency on the data being relied on in policy and the issues that affect the data. Other benefits are in educating the public, policy makers and the media about the issues that affect crime rates.

Projects like this also demonstrate the limitations and potential of open data and provide feedback to data custodians on what the community needs.

Data quality varies by data set and availability varies significantly between jurisdictions.

Data quality refers to whether the data itself is accurate and complete according to what the publisher/data custodian intends to be in the dataset.

Data availability refers to whether a data set is available at all, whether it contains the same granularity or breakdown, the time period it covers and how often it is updated.

In terms of data quality, it is not unusual for a dataset published by a specific jurisdiction to be missing data for some jurisdictions.

In terms of availability, while some jurisdictions publish open crime data, others do not. In some instances, I transform the data myself (i.e. algorithms are applied to it). These algorithms can contain errors.

Sometimes significant transformations have to be applied to the data to make it usable. This can introduce errors. These types of issues will improve greatly over time as the data and processes are checked for accuracy.

A major issue affecting availability is the administrative boundary applied to the open data sets. QPS (Queensland Police Service) publishes crime, incidence, victim and offender data by police District, Division, Region and State but not by LGA or postcode. These datasets themselves cover differing time periods.

NSW Police publishes only crime incidence data (not crime rates, victim or offender data) by LGA and postcode.

With the exception of ABS data, open data is typically published without any documentation. It can be a process of trial and error to figure out what each column refers to, what it includes or excludes, and so on.

Different jurisdictions have different criminal laws. While these are broadly similar, the offences vary between jurisdictions. Granularity also differs between jurisdictions. NSW breaks down assault data into domestic violence related, non-domestic violence related and other. QLD does not, but does have a separate category of 'Breach domestic violence order' (assault is a separate offence).

What this means for the user is that just because you can find a particular result in one jurisdiction, this does not mean it will be available for comparison in other jurisdictions.

For architects and developers of this project, it requires a piecemeal approach based on the data available in each jurisdiction and its characteristics. Consistency between jurisdictions needs to be balanced against what is actually available to feed into the project.
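One way to handle this piecemeal availability is to record, per jurisdiction, which offence categories are published, and only offer a comparison where more than one jurisdiction has the category. The following is a minimal sketch under that assumption. The category labels are placeholders based on the breakdowns described above, not the real column names in either dataset, and `AVAILABLE_CATEGORIES`, `jurisdictions_with` and `comparable` are hypothetical names.

```python
# Placeholder category labels; the real names must be checked against each dataset.
AVAILABLE_CATEGORIES: dict[str, set[str]] = {
    "NSW": {
        "assault - domestic violence related",
        "assault - non-domestic violence related",
        "assault - other",
    },
    "QLD": {
        "assault",
        "breach domestic violence order",
    },
}


def jurisdictions_with(category: str) -> list[str]:
    """Return the jurisdictions that publish the given category, sorted by name."""
    return sorted(
        name for name, categories in AVAILABLE_CATEGORIES.items()
        if category in categories
    )


def comparable(category: str) -> bool:
    """A category can be compared only if more than one jurisdiction publishes it."""
    return len(jurisdictions_with(category)) > 1


print(jurisdictions_with("breach domestic violence order"))
print(comparable("assault - domestic violence related"))
```

With these placeholder labels no category is shared between the two jurisdictions, which mirrors the problem described above: a result found in one jurisdiction may have no direct counterpart in another.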

There's quite a bit of interest in this project and suggestions for expansion are already under way and being implemented as time and data permits. It's worth remembering that not only do I have to find the data and create the functionality but I also have to then make whatever I create user-friendly and mobile ready which can be a real challenge with data projects.

I expect this interest to continue, and I will open the project up further as time permits.

To interact (with me and others) about this project on Facebook, like the Australian Open Data Criminology Project FB page. Posts are moderated but please contribute data sources, news stories, links to other projects, etc.

To interact about this project on Twitter, follow me and use the project-specific hashtag #AusCrimeData. Check out #crimedata to see what other jurisdictions around the world are doing with crime data. The project-specific hashtag is useful because it allows me and others to find your posts at a later date in case we need to refer back to them.

If you're not on FB or Twitter you can email me at

under construction

Current functionality

You can rank all administrative areas against one another on each individual offence. Select an offence from the drop-down menu to see every administrative area ranked on both actual offence numbers and crime rates.
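The ranking described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the area names and figures are invented, and the rate is assumed to be offences per 100,000 population, which is a common convention but is not confirmed for every dataset used here. `AreaCount`, `rate_per_100k` and `rank_areas` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class AreaCount:
    area: str
    offences: int
    population: int


def rate_per_100k(row: AreaCount) -> float:
    """Crime rate, assumed here to be offences per 100,000 population."""
    return row.offences / row.population * 100_000


def rank_areas(rows: list[AreaCount], by: str = "offences") -> list[tuple[int, str]]:
    """Rank areas from highest to lowest on offence numbers or on rate."""
    if by == "offences":
        key = lambda row: row.offences
    elif by == "rate":
        key = rate_per_100k
    else:
        raise ValueError("by must be 'offences' or 'rate'")
    ordered = sorted(rows, key=key, reverse=True)
    return [(position, row.area) for position, row in enumerate(ordered, start=1)]


# Invented figures for three made-up areas and a single offence.
rows = [
    AreaCount("Area A", offences=50, population=10_000),
    AreaCount("Area B", offences=200, population=100_000),
    AreaCount("Area C", offences=120, population=20_000),
]

print(rank_areas(rows, by="offences"))
print(rank_areas(rows, by="rate"))
```

In this made-up example, Area B has the most offences but the lowest rate, which is why it helps to see both rankings side by side.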

You can click an administrative area in the resulting data tables to chart the crime rate and offence numbers for that offence and area. The chart opens on a new page.

Available jurisdictions and boundaries

Jurisdiction    Administrative Area(s)
QLD             QPS District, QPS Division
NSW             NSW LGA (pre amalgamation)

This page has the functionality to chart the actual number of offences by month for the available years, for each offence by administrative area. Mouse over the chart to see the monthly offence number and crime rate, and select a view to change the width of the chart and provide more nuanced data.

Available jurisdictions

Jurisdiction    Data Type                      Timespan
QLD             Offence Number & Crime Rate    1997-2017
NSW             Offence Number & Crime Rate    1995-2017

This page has the functionality to chart crime rate data from January 1995 to December 2016 for the state. You get a bird's-eye view of the trend for each offence and offence group for the years available.

Available jurisdictions

Jurisdiction    Victim Data                   Timespan
QLD             Victim breakdown by Gender    1997-2017
NSW             No victim data                1995-2017