With the election being called as we speak, Jeffery and I have decided to accelerate our plans to stop treating our stuff as a competitive Sekrit and to provide more access for those that want it.
To this end, over the next couple of days we’re planning to release the following.
1. Public access to our SVN repository.
This contains all our source code and all the government source data we ETL from, plus other data maintained by us and others.
An svn checkout of this will consume around 2 gigabytes of your internet quota and around 4 gigabytes of disk space. The repository is by no means finished, and we expect its size to double or triple over time.
2. The N4DB superschema
This is the master database behind all our stuff that is compiled from the 4 gigabytes of stuff in the repository. It has a unified data model which supports all the different types of political information that we track.
The 1.2-ish gig database is based on Postgres 8 + PostGIS 1.4, and the current version contains the following.
- All Australian Jurisdictions
- All Australian Houses
- Every current division at all levels of government, including boundaries
- The 2010 election divisions, including boundaries
- A selection of local government mayors for NSW
- The digital crime mapping boundaries for inner Sydney used
- Foreign keys from all members to Open Australia ids and MyMP ids, where available
- The AEC’s list of all polling places as of 2007, including GPS positions
- The entire 2006 census taxonomy, plus boundaries for all collection districts
All of these tables have proper references to each other, and you can do SQL joins across all the above concepts.
Please note that this database does have some errors in it, and will never really be “finished”. It is, however, temporally safe. It can support storage of not just all current political information, but also all historical information.
Our intention with this is to provide a central read-only database that compiles the core set of data released by government to the public. You can make your own geo2gov-style SQL queries directly, and/or do any kind of more complex query that you wish. You can also use it as a starting point for building your own applications.
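To give a feel for what a geo2gov-style query might look like, here is a minimal Python sketch of a point-in-polygon lookup. The table and column names (`division`, `boundary`, `name`) and the SRID of 4326 are assumptions made purely for illustration, since the real N4DB schema isn't described here. The PostGIS functions `ST_Contains`, `ST_SetSRID` and `ST_MakePoint` are real. The cursor is expected to be any DB-API cursor that accepts named `%(name)s` parameters, such as one from psycopg2.

```python
# Hypothetical names: the real N4DB tables and columns may be called
# something quite different. Adjust these to match the actual schema.
DIVISION_TABLE = "division"
GEOMETRY_COLUMN = "boundary"
NAME_COLUMN = "name"


def divisions_at_point_query(srid=4326):
    """Build the SQL for finding every division that contains a point.

    The SRID default (4326, plain lon/lat) is an assumption about how
    the boundaries are stored.
    """
    return (
        f"SELECT {NAME_COLUMN} "
        f"FROM {DIVISION_TABLE} "
        f"WHERE ST_Contains({GEOMETRY_COLUMN}, "
        f"ST_SetSRID(ST_MakePoint(%(lon)s, %(lat)s), {int(srid)}))"
    )


def divisions_at_point(cursor, lon, lat):
    """Run the lookup on an open DB-API cursor and return division names."""
    cursor.execute(divisions_at_point_query(), {"lon": lon, "lat": lat})
    return [row[0] for row in cursor.fetchall()]
```

The longitude and latitude are passed as query parameters rather than pasted into the SQL string, so the driver handles quoting for you.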
3. Downloadable geo2gov virtual machine
The current public service is hosted on Amazon, and doesn’t really support large-scale bulk usage as well as it should. However, we also plan to release the same virtual machine we use for the public site as a downloadable appliance that you can drop into your own infrastructure. This VM is also handy because it contains the database mentioned above, with Postgres and PostGIS pre-installed and the database already loaded.
So you can just log into the VM, connect to the Postgres database on localhost, and do whatever you want. Because it’s PostGIS, you should also be able to connect desktop geo software (Quantum GIS, ArcGIS, etc.) directly to it and do arbitrary visualisation stuff as well.
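As a rough sketch of what connecting from inside the VM might look like, the Python below builds a connection string for localhost and asks PostGIS for its version. The database name `n4db` and user `postgres` in the usage line are placeholders, not the real credentials, and port 5432 is simply the Postgres default. It assumes the third-party psycopg2 driver is installed.

```python
def build_dsn(dbname, user, host="localhost", port=5432):
    """Build a libpq-style connection string.

    5432 is the Postgres default port; the VM may be configured differently.
    """
    return f"host={host} port={port} dbname={dbname} user={user}"


def postgis_version(dsn):
    """Connect, ask PostGIS for its version string, and disconnect."""
    import psycopg2  # third-party driver, assumed to be installed

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT PostGIS_Version()")
            return cur.fetchone()[0]
    finally:
        conn.close()


# Placeholder database name and user: substitute whatever the VM ships with.
# print(postgis_version(build_dsn("n4db", "postgres")))
```

If the version query comes back, the same connection details should be what you feed into your desktop geo software.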
In summary, although this is a little sooner than I had hoped (it would have been nice to have time to write proper documentation and get it all publicly polished), I think timeliness is now more important.
Further notifications to come as each bit is ready for public consumption.