IBM IS WORKING ON a system that can pinpoint people's locations by looking at their Twitter history.
This might not sound like much of a leap, as Twitter offers location sharing. Many people do not enable it, though, which leaves others wondering where a tweet was sent from.
However, researchers at Big Blue have worked out an algorithm that can place a person by looking at 200 of their tweets, even when those tweets carry no location information.
A paper from IBM entitled 'Home Location Identification of Twitter Users', penned by Jalal Mahmud, Jeffrey Nichols and Clemens Drews, has been published on the online repository arXiv.
You know how these papers go: they begin with an abstract, and the abstract of this one says that the algorithm infers the home location of tweeters by looking at time zones and heuristic classifiers. This, say the researchers, represents a new approach.
"Unlike existing approaches, our algorithm uses an ensemble of statistical and heuristic classifiers to predict locations and makes use of a geographic gazetteer dictionary to identify place-name entities," they said.
"We find that a hierarchical classification approach, where time zone, state or geographic region is predicted first and city is predicted next, can improve prediction accuracy. We have also analysed movement variations of Twitter users, built a classifier to predict whether a user was travelling in a certain period of time and use that to further improve the location detection accuracy."
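To make the two-step idea concrete, here is a minimal sketch in Python of a coarse-then-fine prediction. It is not IBM's implementation: the tiny `GAZETTEER` table, the region labels and the function names `place_mentions` and `predict_home` are all hypothetical, and simple vote counting stands in for the paper's ensemble of statistical and heuristic classifiers.

```python
from collections import Counter

# Hypothetical toy gazetteer (an assumption, not data from the paper):
# lower-case place name -> (coarse region, city).
GAZETTEER = {
    "brooklyn": ("Eastern", "New York"),
    "manhattan": ("Eastern", "New York"),
    "fenway": ("Eastern", "Boston"),
    "hollywood": ("Pacific", "Los Angeles"),
}


def place_mentions(tweets):
    """Return a (region, city) pair for every gazetteer word found in the tweets."""
    hits = []
    for tweet in tweets:
        for word in tweet.lower().split():
            key = word.strip(".,!?#@")  # drop common punctuation and hashtags
            if key in GAZETTEER:
                hits.append(GAZETTEER[key])
    return hits


def predict_home(tweets):
    """Pick the most-mentioned region first, then the top city inside that region."""
    hits = place_mentions(tweets)
    if not hits:
        return None  # no place names found, so no guess is made
    # Step 1: coarse prediction (region).
    region = Counter(r for r, _ in hits).most_common(1)[0][0]
    # Step 2: fine prediction (city), restricted to the chosen region.
    city = Counter(c for r, c in hits if r == region).most_common(1)[0][0]
    return region, city


tweets = [
    "Coffee in Brooklyn again",
    "Stuck in Manhattan traffic",
    "Fenway was loud tonight",
    "Hollywood trip next month",
    "#Hollywood sign selfie",
]
print(predict_home(tweets))
```

In this toy example the region vote settles on "Eastern" before any city is considered, so the two Hollywood mentions cannot pull the city guess towards Los Angeles. Whether that kind of narrowing improves accuracy on real data is the researchers' claim, not something this sketch demonstrates.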
So far, so good, and the IBM researchers reckon that evidence suggests that the algorithm works well in the wild and is adept at "predicting the home location of Twitter users".
This sort of thing will raise privacy hackles, and people will not like the idea that they can be pegged down without ever saying where they are.
IBM did not dwell on the privacy angle, and said that the information could be used to improve capabilities in data mining and event prediction. Examples given are journalists looking to find tweets from a scene, and retailers that want to know who is buying what and where. Another benefit, which sits rather oddly beside the previous aims, is that users will be better able to hide their locations, according to the firm.
"The benefit of developing these algorithms is two-fold. First, the output can be used to create location-based visualisations and applications on top of Twitter," IBM adds.
"For example, a journalist tracking an event on Twitter may want to know which tweets are coming from users who are likely to be in a location of that event, [versus] tweets coming from users who are likely to be far away. As another example, a retailer or a consumer products vendor may track trending opinions about their products and services and analyse differences across geographies."
The system might also be helpful to users who want to blog on Twitter but not reveal their location, according to the firm.
"Our examination of the discriminative features used by our algorithms suggests strategies for users to employ if they wish to micro-blog publicly but not inadvertently reveal their location," it said.
IBM added that its focus is on working out where someone lives, and not their travels. However, it is possible to predict someone's movement using its method.
The researchers approached the study by looking at a collection of tweets with location information. They said that they began by identifying 100 tweeters from 100 cities and then worked from a dataset of 1,524,522 tweets generated by 9,551 users.