
Jobcentre Plus Mirror Data Definitions

ZOIS Technical Note TN-2010-12-01.

Author and Audience

An Application Programming Interface has been presented that allows web-based programs to access the Jobcentre Plus Mirror database[jp]. This TN describes the various data fields used by that API. Readers should be familiar with programming and database techniques and wish to add value to the data provided. Written by Martin Sullivan[au], ZOIS Limited, Cockermouth.

Abstract

A series of possibly vague definitions is given for the data items found within the Jobcentre Plus Mirror Database. This database is an unofficially collected set of vacancy details found at the UK government-sponsored Jobseekers Direct web-site.

Introduction

The Jobcentre Plus Mirror[jp] is an indexed database held at ZOIS. It is the result of scraping the Jobcentre Plus and latterly the Jobseekers Direct web-sites. This data is presented as an FTP-able file in Comma Separated Value form on a nightly basis[ft]. This work has now been augmented by a series of interfaces intended to be called as a Value Added Service from other web-sites.
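As an illustration of consuming the nightly Comma Separated Value file, the following is a minimal Python sketch. The layout of the file is not described in this TN, so the presence of a header row carrying the field names, the three columns shown, and the sample row (including the reference 'ABC/12345') are all assumptions made purely for the example.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded file. The header row and
# the choice of columns are assumptions; this TN does not specify the layout.
SAMPLE = (
    "title,reference,location\n"
    '"Plumber","ABC/12345","Cockermouth, Cumbria"\n'
)


def read_vacancies(handle):
    """Return each CSV row as a dict keyed by the header names."""
    return list(csv.DictReader(handle))


rows = read_vacancies(io.StringIO(SAMPLE))
print(rows[0]["reference"], "-", rows[0]["location"])
```

The csv module is used rather than splitting on commas by hand because free-text fields such as location may themselves contain commas.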

Materials and Platform

The underlying database is PostgreSQL[pg]. The types therefore reflect the types used in this database. These are, in summary:

text
Variable-length character strings of unlimited length.
char, char(n)
Single character or fixed-length string, blank padded.
date
A date type with a resolution of one day. Dates will generally be transferred in ISO-8601 format (1957-01-30, for example). The calendar is the one used in the UK. As the underlying system is PostgreSQL, a number of input values can be parsed into a meaningful internal date, for example 'today' and 'yesterday' as well as '30 January 1957'. Care must be taken lest an invalid syntax error occur; ISO-8601 is preferred.
timestamp
A date-and-time type with a resolution of 1 second (although the value is held at higher precision internally). ISO-8601 format is preferred once again, so midnight on the above date is 1957-01-30 00:00:00.
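A client receiving these values might convert them as in the following Python sketch. It handles only the preferred ISO-8601 forms shown above; the function names are hypothetical, and the looser inputs that PostgreSQL itself accepts (such as 'today') are deliberately not handled here.

```python
from datetime import date, datetime


def parse_jcpm_date(value: str) -> date:
    """Parse an ISO-8601 calendar date such as '1957-01-30'."""
    return date.fromisoformat(value)


def parse_jcpm_timestamp(value: str) -> datetime:
    """Parse a one-second-resolution timestamp such as '1957-01-30 00:00:00'."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


posted = parse_jcpm_date("1957-01-30")
noted = parse_jcpm_timestamp("1957-01-30 00:00:00")
print(posted.isoformat())
print(noted.isoformat(sep=" "))
```

Both functions raise ValueError on anything other than the expected form, which mirrors the advice to prefer ISO-8601 over free-form input.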

Method

Much of this data is designed to be read by humans. It is thus quite vague and contains little standardised, coded information.

title
Summary of vacancy. text
reference
LMS reference, expected to be unique. The reference is provided by the Labour Market System, an underlying database of vacancies held by the Department for Work and Pensions (or their subcontractors on their behalf). This database forms the core of the Jobseekers Direct web-site and subsequently our scraping efforts. The reference consists of three letters identifying an 'owning' Jobcentre Plus Office, a slash ('/') and a serial number. The LMS reference is held to be unique and is treated as such by the JCPM. text
location
Short description of location, can contain postcode data. text
hours
Working hours description, particularly for part-time and split-shifts. text
wage
Short description of payment. Not easily parsable. text
work_pattern
Largely blank, but contains part-time details. text
employer
Sometimes obfuscated employer name. text
employer_ref
A largely numeric reference to the employer. Of limited value. text
pension
Appears to be either blank or "Pension available". text
duration
Permanent (P) or Temporary (T), where noted. The official policy seems to be that any contract of employment with a term greater than six months is Permanent. char, enumerated (P|T)
closing_date
Noted closing date, if any. date
description
Detailed description of the vacancy. text
apply
Application details. text
added
Officially provided posting date; if none is provided, the current date is used. date
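The field list above could be modelled on the client side roughly as follows. This is a sketch under stated assumptions, not the JCPM's own representation: the class and function names are hypothetical, the regular expression assumes that the 'three letters' are ASCII letters and that the serial number is purely numeric, and 'ABC/12345' is an invented reference.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Tuple

# Assumption: three ASCII letters, a slash, then a numeric serial number.
LMS_REFERENCE = re.compile(r"^([A-Za-z]{3})/([0-9]+)$")


def split_reference(reference: str) -> Tuple[str, str]:
    """Split an LMS reference into (office letters, serial number)."""
    match = LMS_REFERENCE.match(reference.strip())
    if match is None:
        raise ValueError(f"not a recognisable LMS reference: {reference!r}")
    return match.group(1), match.group(2)


@dataclass
class Vacancy:
    """One vacancy record, using the field names from the list above."""
    title: str
    reference: str
    location: str = ""
    hours: str = ""
    wage: str = ""
    work_pattern: str = ""
    employer: str = ""
    employer_ref: str = ""
    pension: str = ""
    duration: Optional[str] = None       # 'P', 'T', or None where not noted
    closing_date: Optional[date] = None  # None when no closing date is noted
    description: str = ""
    apply: str = ""
    added: date = field(default_factory=date.today)  # current date if absent

    def __post_init__(self) -> None:
        split_reference(self.reference)  # raises ValueError if malformed
        if self.duration not in (None, "P", "T"):
            raise ValueError(f"duration must be 'P' or 'T': {self.duration!r}")


vacancy = Vacancy(title="Plumber", reference="ABC/12345", duration="T")
print(split_reference(vacancy.reference))
```

Only reference and duration are validated, since they are the only fields described as having a fixed form; the remaining text fields are left as free text.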

In addition, there are a number of fields that are derived from the above.

office_code
Three-letter office code, derived from reference. char(3)
summary
A shortened variation of the description with standardised boilerplate text (describing Local Enterprise Partnerships and so forth) removed. The result is then truncated to 200 characters and is designed to be used in extended search results and the like. text
noted
The time the vacancy was noted by the JCPM system. timestamp
pattern
A PostgreSQL full-text search[t2] pattern comprising a series of keywords optionally interspersed with logical operators. For example:
plumber 
plumber AND 'central heating'
plumber OR carpenter 
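The derivations of office_code and summary described above can be approximated on the client side as in the following Python sketch. It is not the JCPM's own code: the function names are hypothetical, the boilerplate phrase in the usage example is invented, and the actual list of boilerplate texts removed by the JCPM is not given in this TN.

```python
from typing import Sequence


def office_code(reference: str) -> str:
    """Return the three-letter office code that precedes the slash."""
    code, _slash, _serial = reference.partition("/")
    if len(code) != 3 or not code.isalpha():
        raise ValueError(f"no three-letter office code in {reference!r}")
    return code


def make_summary(description: str,
                 boilerplate: Sequence[str] = (),
                 limit: int = 200) -> str:
    """Remove known boilerplate phrases, tidy whitespace, then truncate."""
    text = description
    for phrase in boilerplate:
        text = text.replace(phrase, "")
    text = " ".join(text.split())  # collapse runs of whitespace
    return text[:limit]


print(office_code("ABC/12345"))
print(make_summary("Plumber wanted.  HYPOTHETICAL BOILERPLATE",
                   boilerplate=["HYPOTHETICAL BOILERPLATE"]))
```

Truncation is applied after the boilerplate is removed, following the order given in the description of summary.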

Discussion

This data description is designed to be used in conjunction with the Jobcentre Plus Mirror Application Programming Interface[ap]. This system is documented both on its initial 'index' page and in other Technical Notes.

As with other Technical Notes, feedback is actively solicited. The author may be contacted via the e-mail address found on his public biography page[au]. Should something require changing or enhancing then the fact will be acknowledged with attribution in an Update section.

References

References found in this section, and in particular the HTML links, were correct at the time of writing (2010-11-30).

[au]. Martin Sullivan:
http://www.zois.co.uk/people/martin_sullivan
[jp]. Jobcentre Plus Mirror Database:
http://home.zois.co.uk/jcpnational.html
[ft]. Jobcentre Plus Mirror FTP site:
ftp://ftp.zois.co.uk/pub/jcp
[pg]. PostgreSQL:
http://www.postgresql.org
[t2]. Tsearch2:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2
[ap]. The JSON Application Programming Interface to the Jobcentre Plus Mirror:
http://home.zois.co.uk:591

~Z~


Date: 2010-12-01

