Timely Notification using Micro-Blogging
ZOIS Technical Note TN-2012-01-01.
Author and Audience
Experiences of attempting to use Twitter to
allow timely updates of job vacancy data are described. These may be
of interest to others attempting targeted broadcasting through Social
Media. Knowledge of Twitter and of programming techniques is
assumed. Written by Martin Sullivan[au],
ZOIS Limited, Cockermouth.
A series of ultimately unsuccessful experiments was undertaken
with the aim of leveraging the micro-blogging web-site Twitter to
allow the broadcast of Jobcentre Plus Mirror Database information.
The micro-blogging site Twitter[tw] allows short messages of up to 140 characters to be posted, and read by people who have subscribed to them as a feed. Subscribers can also have these messages selectively delivered to mobile phones using the SMS system. Information can therefore be delivered in a timely, targeted way to suitably equipped, often mobile, devices. Further, users may selectively promote messages by re-sending them to those who subscribe to their own list. Twitter offers a geographic association system allowing users to examine messages targeted to a specific area.
Messages, although short, can have URLs embedded within them, generally using a URL shortening service so that lengthy ones do not eat up the precious 140-character real estate. Messages are generally succinct, but can have key-words highlighted within them by prefacing the keyword with a hash character, '#'. These keywords can then be used in a search. The search continues to run, allowing a user to 'follow' a particular keyword just as they would a particular user.
These experiments required the use of several Twitter accounts. These were set up in the standard way, but each required a new e-mail account, since e-mail addresses are the way that Twitter ultimately judges the uniqueness of each account. The e-mail accounts were all aliased to a central place to allow notification e-mails to be read centrally, and acted upon as necessary.
An account was set up, initially called '@zois_jcpm', and the author 'lurked' there to learn the conventions and manners of Twitter users. Eventually this account was renamed and a number of other accounts were set up as pilots for what was hoped would be a national system, encompassing all the vacancies that the Jobcentre Plus system had to offer.
The initial notion was that an RSS[rs] to Twitter service could be employed to provide a series of Tweets based on existing RSS feeds. RSS has been an important part of the Jobcentre Mirror Database since its inception. It is possible to feed RSS to Twitter[tr], and such services were explored but eventually abandoned. It was decided that the most durable approach would be one based on the production of specialised Tweets.
An initial pilot based on West Cumbria was set up. This had parallels in the set-up of the initial Jobcentre Plus Mirror, which started as an indignant response to the closure of a local Jobcentre Plus Office. Automated feeds were set up for these accounts. These feeds were written in Perl[pl] using the Net::Twitter[nt] module.
The above Twitter accounts are automated; one further account (@JCPM_Feedback) was set up specifically for feedback and other human interaction. This was monitored as closely as possible and feedback was responded to. In the argot of Twitter, responses were made to 'mentions' and 'direct messages', and where a follower seemed to be a legitimate human a 'follow-back' was undertaken.
There are some restrictions on the automated Twitter feeds. Not every posting on the Jobseekers Direct web-site that was scraped into the JCP Mirror Database, and so found itself on the FTP site and so forth, wound up being tweeted. Twitter's flakiness saw to that, but in addition it had been decided not to tweet the internal messages, which seem not to be for vacancies at all but rather for back-to-work courses and so forth. What little of the description can be put in a Tweet is abbreviated so that as much of the goodness gets through as possible. Messages are mechanically searched for 'boiler-plate' announcements about Local Partnerships, Agency postings and so forth, and these are eliminated.
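The abbreviation and boiler-plate removal described above can be sketched as follows. This is an illustration in Python rather than the Perl actually used for the feeds; the boiler-plate patterns, the function names and the idea of passing in an already shortened link are all assumptions made for the example.

```python
import re

TWEET_LIMIT = 140  # Twitter's message limit at the time of writing

# Hypothetical boiler-plate patterns; the real list is not given in this note.
BOILERPLATE_PATTERNS = [
    r"this vacancy is being advertised on behalf of[^.]*\.",
    r"local employment partnership[^.]*\.",
]

def strip_boilerplate(description):
    """Remove boiler-plate sentences and collapse runs of whitespace."""
    for pattern in BOILERPLATE_PATTERNS:
        description = re.sub(pattern, " ", description, flags=re.IGNORECASE)
    return " ".join(description.split())

def compose_tweet(title, description, short_url, hashtags=()):
    """Build a tweet of at most TWEET_LIMIT characters.

    short_url is assumed to be already shortened; hashtags are given
    without the leading '#'.
    """
    tags = " ".join("#" + tag for tag in hashtags)
    # Space reserved for the link, the tags and the separating spaces.
    reserved = len(short_url) + 1 + (len(tags) + 1 if tags else 0)
    room = TWEET_LIMIT - reserved
    body = f"{title}: {strip_boilerplate(description)}"
    if len(body) > room:
        # Abbreviate the description and mark the cut with an ellipsis.
        body = body[: room - 3].rstrip() + "..."
    parts = [body, short_url] + ([tags] if tags else [])
    return " ".join(parts)
```

The link and the hash-tags are reserved first, so it is always the free-text description that gives way when space runs out.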
Originally, with the aim of reducing clutter and the consequent pollution of search results, the vacancy tweets were automatically deleted after one week. The accent was on immediacy and notification. If a search for something a little more long-term was desired, the regular Jobcentre Plus Mirror search was recommended, and indeed even the official Jobseekers Direct one.
Subsequent experience with Twitter Search, which is critical to this work, suggested that the deletion of 'expired' tweets was pointless, so it was stopped.
Twitter uses its own terms to describe its actions and they are used within this narrative. Although well documented elsewhere on the web, the following are presented as a handy aide-mémoire.
- Tweet
- A posting made on a Twitter account. These can contain highlighted key-words (prefaced by a '#'); references to other Twitter accounts (prefaced by an '@'); Twitter search terms (prefaced by a '$') and URLs, which are mapped to shortened URLs[su]. There is an absolute requirement that these postings be no longer than 140 characters. Brevity is therefore imperative.
- Follower
- A subscriber to a series of Tweets, generally posted by a single account. Subscribers can be 'followers' of multiple accounts and can have 'followers' in their own right. This is the basic connecting mechanism in Twitter.
- Re-tweet (RT)
- A Twitter posting (or Tweet) that has been reposted by a subsequent account.
- Direct Message (DM)
- Where accounts mutually follow one another, a Direct Message can be sent between them. Again this is limited to 140 characters and can contain embedded hash-tags and shortened links.
- Hash-tag
- Keywords highlighted in a tweet by use of a prefacing '#' character. They are highlighted in Twitter's display and are linked to an underlying search. A hash-tag can be followed in much the same way as a user, by allowing a search to run in real time.
- Mention
- Account names (for example JCPM_Feedback) may be prefaced within a Tweet with an '@' character (so, @JCPM_Feedback). When this is done, the account so mentioned receives a notification that it has been 'mentioned' in another user's Tweet.
- Trending
- Words or phrases (frequently hash-tagged) that are actively being used multiple times by multiple accounts. This 'trending' activity is posted on the initial Twitter page on the web-site, and represents what the Twitter community finds fashionable.
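As an illustration of the conventions listed above, the following Python sketch picks the hash-tags and mentions out of the text of a Tweet. It is deliberately simplified and is not how Twitter itself parses messages; the function name and the treatment of a leading 'RT @' as the mark of a re-tweet are assumptions for the example.

```python
import re

def tweet_entities(text):
    """Return the hash-tags and mentions found in a tweet's text.

    The look-behind stops an '@' or '#' in the middle of a word
    (an e-mail address, say) from being counted.
    """
    return {
        "hashtags": re.findall(r"(?<!\w)#(\w+)", text),
        "mentions": re.findall(r"(?<!\w)@(\w+)", text),
        # Assumed convention: manual re-tweets begin with 'RT @account'.
        "is_retweet": text.startswith("RT @"),
    }
```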
SOC Codes and Hashtags
One of the small bits of feedback that has come back about the JCP Mirror Twitter pilots is that there should be some kind of tagged key-words. Since the text offered by Jobseekers Direct, and ultimately by the potential employer, is just a rather wordy description, this is rather hard to do.
Happily, however, most jobs are now given a SOC code[sc]. These can be, and have been, used to provide just such a crude key-word, and they have been added to the Tweets. The choice of what each key-word should be was the author's.
Peripheral to this Twitter work, the link at the bottom of the job description page has been beefed up. Clicking on this link will produce an instant Twitter posting that can be edited prior to posting on some other Twitter account. If the intended user is not logged in, they should be prompted to do so by Twitter.
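A minimal Python sketch of the SOC-code-to-key-word step follows. The only code taken from this note is 2112; the key-word chosen for it, the fall-back tag format and the function name are hypothetical, as the author's actual table of key-words is not reproduced here.

```python
# Hypothetical key-word table. SOC 2112 ("Biological scientists and
# biochemists") is the one code quoted in this note; the key-word is invented.
SOC_HASHTAGS = {
    "2112": "Biochemist",
}

def soc_hashtag(soc_code):
    """Return a hash-tag for a SOC code, or None if the vacancy has no code."""
    if not soc_code:
        return None
    code = str(soc_code).strip()
    keyword = SOC_HASHTAGS.get(code)
    # Fall back to a tag built from the bare code when no key-word is known.
    return "#" + (keyword if keyword else "soc" + code)
```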
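One way such a link can be built is with Twitter's web 'intent' address, which opens a pre-filled, editable posting and prompts for a log-in if necessary. Whether the job description page uses exactly this mechanism is an assumption; the function name and the example values are hypothetical.

```python
from urllib.parse import urlencode

# Twitter's web intent address for composing a tweet.
INTENT_BASE = "https://twitter.com/intent/tweet"

def tweet_this_link(text, url):
    """Return a link that opens a pre-filled tweet for the user to edit."""
    # urlencode escapes '#', spaces and the embedded URL safely.
    return INTENT_BASE + "?" + urlencode({"text": text, "url": url})
```

For example, `tweet_this_link("Care Assistant #jcpWRK", "http://example.org/job/1")` yields a link whose `text` and `url` parameters carry the suggested posting.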
Toward a National Twitter System
After the West Cumbria pilot was up and working, and indeed attracting some positive feedback, thought was given to a national version. In this respect a path was followed that had already been made in other parts of the Jobcentre Plus Database Mirror system.
The approach evolved as the Twitter ecosystem was explored. Setting up a national system proved rather harder to execute than was naïvely first thought, and has now largely been abandoned.
The initial idea was to set up a series of Twitter accounts, each of which would post exclusively the vacancies notified to one office. There would thus be a Twitter account per active Jobcentre Plus Office. Jobcentre Plus offices are identified by a three-letter code ('WRK' for Workington, for example) and these form part of the reference found in the vacancy details. It reflects the underlying Labour Market System (LMS) upon which the Jobcentre web-site and the subsequent Jobseekers Direct web-site were based. It would essentially be the Cumbria Pilot writ large.
To do this on an office-by-office basis would demand that 800 or so Twitter accounts be built and maintained automatically. This is a no-no as far as Twitter is concerned, and mechanisms such as 'Captchas' are employed to discourage it.
A second idea, then, was that a large sub-section of the new postings would be dumped in a single account, all Geolocated and hash-tagged up, so that folk could search this stream. It would exploit the observation that if a search-term is used in at least the web-site version of Twitter, it is periodically updated with new tweets. The mechanism extends to specialised Twitter clients, but not as far as an SMS delivery mechanism.
On a typical working day some 7,000 postings are made to the Jobseekers Direct web-site, and are thus scraped and appear in the Jobcentre Mirror database. Although an attempt was made to filter the stream down to vacancies posted directly by employers, removing 'agency' postings and internal course postings by the Jobcentres themselves, the volume was still very large. So large, in fact, that it ran into poorly documented rate limits imposed automatically by Twitter.
In order to keep the signal-to-noise ratio as high as possible, the stream of jobs was limited to vacancy data that did not indicate that it had been posted by an Agency. The decision was somewhat arbitrary, it is granted, and was imposed in addition to the existing Twitter conventions. When the rate limits were discovered, plans were made to sub-divide the national stream into a number of large regions (the whole of Scotland, North-west England, that sort of thing), but this approach was abandoned in the light of experience with Twitter's search mechanisms, which were central to the large, multiple-office accounts.
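The filtering and throttling just described might be sketched in Python as below. The vacancy field names, the crude test for the word 'agency' and the notion of a fixed hourly cap are all assumptions; the real rate limits were poorly documented and no figure for them is given here.

```python
def is_wanted(vacancy):
    """Keep employer postings; drop agency and internal course postings.

    'description' and 'internal_course' are hypothetical field names.
    """
    if vacancy.get("internal_course"):
        return False
    return "agency" not in vacancy.get("description", "").lower()

def select_for_posting(vacancies, posted_this_hour, hourly_cap):
    """Return the wanted vacancies that still fit under an assumed hourly cap."""
    remaining = max(0, hourly_cap - posted_this_hour)
    wanted = [v for v in vacancies if is_wanted(v)]
    return wanted[:remaining]
```

Anything that does not fit under the cap is simply left for a later run, which is one reason not every vacancy ends up being tweeted.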
'Following' any of these 'National/Regional' accounts would mean that you would necessarily encounter a large number of irrelevant vacancies from far away. The effect would be like trying to drink from a fire-hose. This was realised early on and a mechanism was devised to flag each tweet both with the three-letter code of the Jobcentre in question, using a hash-tag (Workington thus became '#jcpWRK'), and with a hash-tag based on the provided location data ('#Workington'). Further, the tweets were geolocated with a crude representation of the location of the job based on gleaned postcode data. While this meant that the location of the vacancy would be displayed somewhat uselessly on a small map, it also meant that it could be searched for geographically using Twitter's 'Nearby' search.
The hash-tag approach was also extended to job classification. The SOC code that is now associated with the majority of vacancies would be used to provide a rather crude hash-tag classification of a job.
It was thus hoped that, by building a judiciously crafted search, a continuous stream of just the vacancies required in a set geographical location could be followed, just like the 'trending' schemes that one observes when using the Twitter web-site.
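The tagging scheme and the kind of search it was meant to support can be sketched in Python as follows. The '#jcp' prefix and the '#Workington' style of location tag come from the description above; the function names, the handling of multi-word locations and the way the tags are combined into a search string are assumptions.

```python
import re

def office_hashtag(office_code):
    """'WRK' -> '#jcpWRK', following the scheme described above."""
    return "#jcp" + office_code.strip().upper()

def location_hashtag(location):
    """'leeds park place' -> '#LeedsParkPlace'; None if nothing usable."""
    words = re.findall(r"[A-Za-z0-9]+", location)
    if not words:
        return None
    return "#" + "".join(word.capitalize() for word in words)

def search_query(office_code, location, soc_tag=None):
    """Combine the tags into a single search string a user could follow."""
    tags = [office_hashtag(office_code), location_hashtag(location), soc_tag]
    return " ".join(tag for tag in tags if tag)
```

A user interested in Workington vacancies would then follow the search `#jcpWRK #Workington`, optionally narrowed by a SOC-derived tag.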
@JCPM_National is Repositioned
As JCPM_National's truly national ambitions had been abandoned, some thought was given as to what the account might be used for. Using SOC codes it is possible to divide the Jobcentre Mirror database up into crude job categories. Some of these categories are of high value and probably transcend the geographic constraints to which Jobcentre Plus jobs are normally subject. The account could therefore be used to provide a stream of high-value, skill-based vacancies, where geography is less important.
Such things do regularly appear, and it was always tempting to dismiss them as employment legislation box-ticking exercises.
Again users would find this Account through initial judicious searching and then would subscribe to it through a variety of means.
The author started as a Biochemist and this skill was chosen for use in the experiment. As a result the National feed is now exclusively of vacancies that have been categorised as having SOC code 2112: "Biological scientists and biochemists".
As with the other regional pilot Twitter feeds, the expansion of
this experiment was dependent upon feedback.
The Twitter experiments are not considered a success.
The pilots have attracted little traffic because of poor search results. The pilots may have been marked down in search results, and the unique hashtags associated with them return no results.
Searching using terms in test messages does not return badly ranked results, but rather no results whatsoever. This extends to third-party Twitter searches and to searches made on more generalised engines such as Google or Bing. It leads to the conclusion that these engines are barred from searching Twitter content, but in some selective way.
This observation has been confounded by account suspension. See Updates.
At time of writing this could be demonstrated by doing a search for the uniquely contrived hash-tag '#jcpLPP', which should return automated tweets made to the Leeds Park Place JCP Twitter Pilot (@JCPM_Leeds). In fact it returns nothing, even though the pilot is otherwise quite active. If you were unaware of the Pilot and searched for it manually, you would not find any vacancies.
Promotion by a human seems to have little effect on this, as Tweets mentioning discounted accounts appear to be marked down in search too, although the evidence for this is thin.
Target-audience followers appear to have found the pilots, particularly the Cumbrian ones, by word-of-mouth. Non-target-audience followers appear to have been generated by non-Twitter search[tj], possibly returning results of some antiquity.
The initial concern was that of capacity. A single National account attracting just 1% of the currently registered unemployed would have resulted in a 'follower' list in excess of that of most celebrities, who frequently count such things in the hundreds of thousands. As these folk would have examined either the source material on the Jobcentre Mirror site (home.zois.co.uk) or, via the redirection services, the original data (jobseekers.direct.gov.uk), it was a worry that this would saturate the slightly-better-than-ADSL line that these experiments depended upon. In the end this worry was unfounded.
The heavy dependency on resource discovery, and that Twitter acts to
hamper this, seems to be the primary finding of these
experiments. Concerted effort, almost certainly involving expenditure
and advertising, would be required to overcome these obstacles and,
within the current budget of ZOIS, these are insurmountable. Even
then, the cooperation of those who are guardians to search engines
will almost certainly be required, and one must wonder at what their
objectives are.
An interesting side-observation on all this is that to be
successful on Twitter one needs to have a significant non-Twitter
media presence. Hence Twitter's somewhat vapid cult of celebrity.
As with other Technical Notes, feedback is actively solicited. The author may be contacted via the e-mail address found on his public biography page[au]. Should something require changing or enhancing then the fact will be acknowledged with attribution in this Update section.
- Twitter Now Suspending Accounts
- Twitter has started suspending the automated Twitter pilots, arbitrarily, and without notification. As a result some of the explanation in this paper may reference things that no longer exist. 2012-03-06
References found in this section, and in particular the HTML links were correct at time of writing (2011-07-01).
- [au] Martin Sullivan:
- [jp] Java Portlet Specification - JSR 168:
- [nj] The Unofficial National Jobcentre Plus Mirror:
- [tr] RSS to Twitter Feed:
- [sc] SOC Codes on JCPM:
- [tj] TwitJobSearch:
- [rs] Really Simple Syndication:
- [tw] Twitter:
- [nt] Net::Twitter Perl Module:
- [pl] Perl:
- [su] Twitter's Link Service: