Some Reflections on Sweeper from N.E.A.T. Nigeria

In April we were contacted by a group out of Georgia Tech, M.I.T. and students on the ground in Nigeria about the then-upcoming elections. This group of individuals, working together as N.E.A.T. (the Nigerian Election Aggregation Team), wanted to run a campaign that mashed up data from several different Ushahidi deployments, Twitter and other sources, displaying them in their own Ushahidi deployment. They ended up writing a lot of custom code, but this was the first ‘stress test’ of the SwiftRiver platform and our Sweeper application to date.

The following is a review of the N.E.A.T. team’s experiences with Sweeper. It was written by Thomas Smyth from Georgia Tech just after their election project was complete on May 2:

What Sweeper Did Well

  • Quick setup: Jon had our instance up and running in what seemed like a heartbeat. This was much appreciated.
  • Reliability: Sweeper stayed up pretty reliably as long as I didn’t break it!
  • Auto-Tagging: This feature was pretty neat and our system used Sweeper’s tags for meta-analysis.
  • Support: Matt was available consistently for in-depth help and scheming. We appreciated this.

Issues With Sweeper

  • Bugginess: Several major bugs were encountered, e.g. the duplication service. But this is to be expected for a young project.
  • Twitter lag: Twitter updates weren’t showing up for many minutes. Since Twitter was our main source of timely information, this was a big problem. We ended up implementing our own scraper using Twitter’s stream API, which has worked brilliantly. Matt and I have discussed this.
  • Searching: Sweeper currently doesn’t allow searching of reports, and this was a desired feature which we implemented. We also implemented a ‘saved search’ feature, which turned out to be quite useful. It allows the user to specify a search string (such as “guns or bombs or knives”) to be “tracked”. The system then searches all incoming reports and maintains a time series visualization. This allows a user to see what topics are ‘spiking’. Something like this would fit nicely in the analytics panel in Sweeper.
  • Analytics panel: There are a few good things here but the interface could be a lot denser, so that more useful analytics could be added. For instance, top tags could be represented with a compact table rather than a bar chart. Charts should only be used in cases where the visual representation provides a clear benefit. Pie charts are usually unnecessary, etc.
  • Geolocation problems: The automatic geolocation service was quite dodgy. I didn’t do any actual counting but I’d say upwards of half the results were wrong. I think it’s a difficult thing to do automatically. So much ambiguity, etc. We ended up building a custom solution for geolocation, incorporating polling booth data (120k of them!) from INEC. The system could automatically recognize a polling unit code like 03/04/12/013 in a tweet, and translate that into a geolocation.
  • Scanning interface: The main interface of Sweeper, where users quickly scan through reports and categorize them, could be more efficient. It’s not clear why each report needs to take up so much space, and why the interface doesn’t scale to fit the whole screen. The animations were also somewhat disorienting. In our system, we tried an approach where users ‘checked out’ a batch of 10 reports and quickly scanned them in a compact table format, marking relevant ones with a checkbox. This seemed to work nicely, and didn’t require (I think) as many requests to the server. In general, I think Sweeper’s interface could be tightened a lot. Users are more likely to be experienced, frequent visitors, rather than occasional ones (I think). Therefore you can make it a little more efficient and specialized than a general purpose website. I think users would appreciate this. I’d be happy to consult further here if there is interest.
  • Code and documentation: Much of the functionality described above could perhaps have been added to Sweeper. However, we found it hard to get started on adding plugins. The codebase could be better organized so that it is clear where code for different components should go. The code itself could also be cleaner in places. Also, documentation needs to be available. But again, we realize Sweeper is a young project and these things are surely on the TODO list!
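For readers curious what the polling unit recognition described above might look like, here is a minimal sketch. It is not N.E.A.T.’s actual code: the function names, the lookup table and the coordinates in it are all hypothetical placeholders, and the only detail taken from the review is the code format (03/04/12/013).

```php
<?php
// Hypothetical lookup table: polling unit code => [latitude, longitude].
// The coordinates are placeholders, not real INEC data.
$pollingUnits = [
    '03/04/12/013' => [9.0, 7.5],
];

/**
 * Find polling unit codes of the form NN/NN/NN/NNN in a piece of text.
 */
function findPollingUnitCodes($text)
{
    // The lookarounds stop the pattern from matching inside longer digit runs.
    preg_match_all('#(?<!\d)\d{2}/\d{2}/\d{2}/\d{3}(?!\d)#', $text, $matches);
    return $matches[0];
}

/**
 * Return the coordinates of the first recognised code, or null if none is known.
 */
function geolocate($text, array $pollingUnits)
{
    foreach (findPollingUnitCodes($text) as $code) {
        if (isset($pollingUnits[$code])) {
            return $pollingUnits[$code];
        }
    }
    return null;
}

$tweet = 'Voting delayed at PU 03/04/12/013, materials arrived late';
$location = geolocate($tweet, $pollingUnits);
```

Matching an explicit code against a known table sidesteps the ambiguity of free-text place names, which is presumably why it worked better than fully automatic geolocation.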

That’s all I have for now guys. Let me know if you have any questions. Many thanks for everything. Let’s keep talking!

This is great feedback. Some of it we’ve already begun working on, while the rest (both the code and the suggestions) has been added to our roadmap.

(Photo from http://www.uiowa.edu)

Open Source Bookmark Curation

With the latest release of Sweeper, you can roll your own bookmarking service. This is really powerful when you start activating plugins like our auto-tagger SiLCC or our Push plugins, which can output all of your bookmarked content as a feed that can be consumed by other applications.

We call this little plugin Quiver. It’s where you manually collect and store information using Sweeper. Essentially it turns Sweeper into your free and open source Delicious clone, with all the contextualization and aggregation features that people have come to love it for.

So how does it work? It’s simple! Just download and install Sweeper v0.3.2 (the current release) or any later version, which can be found here.

Once you’ve done that, go to the ‘sources panel’.

Select ‘Quiver’ from the list.

Drag the bookmarklet to your browser bar.

Done! Sweeper is a tool for the curation of real-time media. Now the things you find interesting can be mashed up with the content you’re aggregating from the web, twitter, email and other feeds! It’s particularly useful for journalists or researchers who need the real-time content, but who want to augment that with their personalized interests and findings.

Get it from Swiftly.org

Sweeper V0.3.1 Released - Twitter Streaming and Proxy Support

Download Sweeper V0.3.1 With Support for Twitter Streaming API here

Any of you keen Swiftriver lovers out there will remember the recent launch of our 0.3 version of the Sweeper app. The release went well and we got loads of feedback.

We were able to identify a couple of bugs in the release that were causing headaches for you guys and girls out there trying to set Sweeper up. So we fixed them …

Fixed the mysql table engine bug, thanks to everyone on the Swiftriver google group for finding and helping to fix this!

Fixed the link back to original content in the content source popup.

So why didn’t we give you these fixes a week ago … well, we wanted to hold off as we were sooooo close to finishing two other pieces of work that it didn’t seem right to give you the bug fixes without a little ‘we’re sorry’ present, so here it comes:

  1. Swiftriver and Sweeper now support collecting content from the Twitter Streaming API. This is a massive win for our software and we know that there will be loads of you who want to check it out. It’s still early doors on this at the moment and there are a few things that we haven’t quite got right – for example we can’t tag and translate all the content coming from a stream yet – but we didn’t feel that this was reason enough to hold you dudes back. So if you LOVE data and want to see a whole lot of it, download the latest V0.3.1 release of Sweeper and get cracking.
  2. Sweeper can now be deployed behind a proxy. This is great news for any of you wanting to use Sweeper on your corporate networks. We know you have had issues with this in the past; well, they are now truly in the past!!!

We hope that you love this latest mini release as much as we do and as always, we love your comments, questions and criticisms (yes we really do!)

Just a small note for those of you who are already running a version of Sweeper but who want to take advantage of the new features in this release … while Sweeper does not yet officially support upgrades, it is possible. Basically all you have to do is dump the new code over the existing install, then run through the installer (http://[where-you-installed-sweeper]/installer). When you get to the ‘create admin user’ section, just skip it (click on ‘6:Proxy Server Setup’ at the top of the page) without entering a new admin password. Hope this helps, and we are working on an official upgrade path to make adoption of the latest versions of Sweeper easier in the future!

So that’s all for now folks, till next time,

Matthew Griffiths,

Director of Platform

SwiftRiver

Translating Realtime Social Media

One of the problems a lot of crowdsourcing projects have is that they end up pulling in massive amounts of data from the web, Twitter and other channels from around the world. This means content arrives in many different languages, often languages that the deployer doesn’t speak.

Currently in Sweeper and soon in Ushahidi, users can translate real-time content from one language into another, on the fly, as they receive it. This is done using our Google Translate plugin which currently supports 50+ languages.

For the Sweeper deployment we’re using to monitor the situation in Japan internally, we’re using this feature to monitor events, since we can’t manually translate every single message coming through. We’ve found it a significant timesaver. You can also see below that we’re showing the user what language the message was translated from, or if it’s been translated at all…

Before:

After:

It’s important to understand that this is machine translation, so it’s far from perfect. But if you’re monitoring feeds from multiple countries across Twitter, RSS, Email or SMS, it’s sometimes useful enough to get a quick sense of what’s being said, where to potentially look for more info, or perhaps where to direct human translators.

Sweeper v0.3.0 Released

Download Sweeper V0.3 Now - Click Here!

Hi all you Swiftriver and Sweeper followers out there.

The Swiftriver team is over the moon to announce the launch of our latest version of the Sweeper app.

Those of you who have been following our progress will know that this release comes hot on the heels of the V0.2 that we pushed to you all a couple of months ago.

As always, the Swiftriver guys and girls have had their heads down crafting and coding some great new features that go a long way to making Sweeper bigger and better than ever before.

So, what can you expect to see out of the V0.3 release?

The Sweeper Dashboard

Users of the new release will be greeted by a lovely shiny new dashboard that makes use of our custom built analytics module and the great jQuery graphing library jqplot [http://www.jqplot.com/] (thanks Chris for this easy to use and powerful tool). You can expect to see a lot more data visualisation in upcoming releases and we intend to utilise the power of the new analytics module throughout the Swiftriver family. 

Tag Based Navigation & Channel Based Navigation

Building on our powerful tagging service, the latest version of Sweeper now offers users the ability to refine their view of content based on tags that are important to them. Simply click on any tag in the content list and see the list repopulated with only content that contains that tag!

Want to compare and cross check what people are saying on Twitter with what they are posting to Flickr? Well now you can with Sweeper V0.3. The new channel based navigation filter makes it dead easy to view only content collected by a specific channel while still continuing to collect and process content from all over the web.

Content Clustering

This is a great new tool that allows you to see how similar other content is to a new piece of content. Make sense? Well basically, every new piece of content that comes into Sweeper now has some scores attached to it, showing you how its set of tags matches up to the larger set of tags in the system. To make this really relevant, we first show how similar this content is to all other content, then we show how similar it is to content you’ve already marked as ‘accurate’. We expect this feature to grow and grow, and the Swiftriver team think this will be a key metric in the battle to cut through the noise and get down to only really interesting content.
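To make the idea of a tag-based similarity score concrete, here is a minimal sketch. This post does not say how Sweeper actually computes its scores, so the Jaccard measure used here (shared tags divided by all distinct tags) is an assumption chosen for illustration, and the function names and the array shape of a content item are hypothetical.

```php
<?php
/**
 * Jaccard similarity between two tag lists: shared tags / all distinct tags.
 * Returns a float between 0.0 (nothing shared) and 1.0 (identical sets).
 */
function tagSimilarity(array $tagsA, array $tagsB)
{
    $a = array_unique(array_map('strtolower', $tagsA));
    $b = array_unique(array_map('strtolower', $tagsB));
    $union = array_unique(array_merge($a, $b));
    if (count($union) === 0) {
        return 0.0;
    }
    return count(array_intersect($a, $b)) / (float) count($union);
}

/**
 * Score a new item's tags against the pooled tags of a set of items.
 */
function scoreAgainstPool(array $newTags, array $pool)
{
    $pooled = [];
    foreach ($pool as $item) {
        $pooled = array_merge($pooled, $item['tags']);
    }
    return tagSimilarity($newTags, $pooled);
}

// Hypothetical content already in the system.
$allContent = [
    ['tags' => ['flood', 'rescue'], 'accurate' => true],
    ['tags' => ['football'], 'accurate' => false],
];
$accurateContent = array_filter($allContent, function ($item) {
    return $item['accurate'];
});

// The two scores described above: against everything, then against 'accurate' content.
$newTags = ['Flood', 'rescue', 'Brisbane'];
$scoreAll = scoreAgainstPool($newTags, $allContent);
$scoreAccurate = scoreAgainstPool($newTags, $accurateContent);
```

In this toy data the new item scores higher against the ‘accurate’ pool than against all content, which is the kind of signal the two scores are meant to surface.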

Content Refresh Message

The new version of Sweeper keeps you updated with messages about how many new pieces of content have been collected!

As always, we didn’t manage to cram everything into this release that we would have liked to – but no worries, it just makes the next release all the juicier!

So before I go, I can give you a little taste of some of the things you can expect to see from the next release:

  •  Revised Sweeper Panel
    We have been doing some work internally on improving the user workflow around voting on content. Expect the next release to have some jazzy new buttons for you all to press!
  •  Pluggable Content Ordering Module
    Building on some of the great statistical tools we have in Sweeper already, the next release will ship with a whole new plugin framework (in addition to our Parser and Turbine models) that will allow you to change the order in which content is delivered based on things like location, RiverID score, tag clustering etc. And because it’s pluggable, you can dream up your own wild and wonderful ways of sorting the quality from the noise!

Well that really is it for now. Thanks for following our project and please get in touch if you have any issues, suggestions or would like to contribute.

Thanks as always to the Swiftriver team and our brothers and sisters over at Ushahidi for making it all possible!

Matt

Director of Platform, Swiftriver

Developing Plugins for SwiftRiver Applications

Ahmed Maawy, the newest hire to the SwiftRiver project, recently compiled this great ‘how-to’ guide on writing plugins for SwiftRiver applications like Sweeper and SwiftMeme.  These plugins can mostly be found at http://plugins.swiftly.org while the wishlist for things we’d like to see built can be found here.

For a great example of how Swift plugins work, check out the Ushahidi Report Push plugin, which allows content verified in Sweeper to be passed along to Ushahidi.  Coupled with the Yahoo Placemaker plugin, this is really powerful as it allows all content to pass from Sweeper to Ushahidi, auto-geolocated.

You can view a fully formatted version of this guide on Google Docs


Before we begin it is worth noting that all SwiftRiver applications have 3 major components:

  • SwiftRiver Core - the engine behind content retrieval, processing and storage.
  • Installer - in charge of initial setup of the SwiftRiver platform.
  • Sweeper - the web application, built on top of the Kohana PHP framework, that provides a UI for the operations performed by the SwiftRiver core.

There are 3 very important elements to understand for SwiftRiver applications when developing and extending the platform for customized functionality (These 3 elements can be considered as “plugins” for SwiftRiver).

  • Impulse Turbines - Are elements that process and add value to content received from external sources.
  • Reactor Turbines - Are event handlers, and are not necessarily meant to add value to content but to react to specific events within SwiftRiver.
  • Sources - Are parsers for different types of content. They are responsible for retrieving content from the Internet or other relevant sources, and translating this content to SwiftRiver content items, so that content from different sources can all have a uniform format within SwiftRiver.

It is important to note that the SwiftRiver /Modules folder contains a number of these event handlers (Reactor turbines and Impulse turbines). However, Sources (also known as Parsers) are developed within the /Modules/SiSPS/Parsers folder.
 
This is a step by step approach regarding how content is received and processed within the core:

  • Parsers take the content from the various external sources, and convert it to the Swift object model.
  • Impulse Turbines may act on the SwiftRiver content items and add value to these content items.
  • Reactor Turbines may act on the content before it is processed by Impulse Turbines, during its processing cycle, or at any later point in the lifetime of the content after specific user actions (such as marking content as accurate).
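The three steps above can be pictured with a toy pipeline. This is only a sketch of the order of operations: plain closures and arrays stand in for the real parser, turbine and content classes, and every name in it is hypothetical apart from the ContentPostProcessing event name, which appears later in this guide.

```php
<?php
// Stand-in parser: converts raw strings into uniform content items.
$parser = function (array $rawLines) {
    $items = [];
    foreach ($rawLines as $line) {
        $items[] = ['text' => $line, 'tags' => []];
    }
    return $items;
};

// Stand-in impulse turbine: adds value by tagging matching items.
$impulseTurbine = function (array $items) {
    foreach ($items as &$item) {
        if (stripos($item['text'], 'flood') !== false) {
            $item['tags'][] = 'flood';
        }
    }
    unset($item); // break the reference left over from the loop
    return $items;
};

// Stand-in reactor turbine: reacts to an event without changing the content.
$log = [];
$reactorTurbine = function ($eventName, array $items) use (&$log) {
    $log[] = $eventName . ': ' . count($items) . ' item(s)';
};

// 1. parse, 2. enrich, 3. react to the resulting event.
$items = $parser(['Flood warning for Brisbane', 'Road reopened']);
$items = $impulseTurbine($items);
$reactorTurbine('ContentPostProcessing', $items);
```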

Parsers / Sources

Parsers are located within the /Modules/SiSPS/Parsers folder and must follow these rules:

  1. The file name must have the format <Parser_Name>Parser.php.
  2. The class name has to be the same as the file name.
  3. The class must implement the IParser interface.
  4. The class must be within the namespace SwiftRiver\Core\SiSPS\Parsers.
  5. The class must contain the following functions:
    • GetAndParse($channel): Returns an array of Content Items.
    • ListSubTypes(): Returns the sub types of the Parser.
    • ReturnType(): Returns the type of the parser (which has to be the same name you specified in <Parser_Name>).
    • ReturnRequiredParameters(): Returns an array of the parameters required to initiate a single source entry for this parser.

You may take a look at how content items for Twitter are generated for an example of how parsers work. Content Items are also passed back together with Source data where available. You may also need to know how the object models for Channel, Source, and Content are structured. These classes are located within the /ObjectModel/ folder.
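As a rough illustration of the parser rules, here is a minimal skeleton. Treat it as a sketch under assumptions: the IParser stub is declared inline only so the example runs on its own (the real interface ships with the core and its signatures may differ), PlainTextParser is a made-up parser, and plain arrays stand in for the Channel and Content classes from /ObjectModel/.

```php
<?php
// Stub declared here only so the sketch is self-contained (assumption).
namespace SwiftRiver\Core\SiSPS\Parsers;

interface IParser
{
    public function GetAndParse($channel);
    public function ListSubTypes();
    public function ReturnType();
    public function ReturnRequiredParameters();
}

// Would live in /Modules/SiSPS/Parsers/PlainTextParser.php (rules 1 and 2).
class PlainTextParser implements IParser
{
    // Turns each non-empty line of the channel's 'text' parameter into one item.
    public function GetAndParse($channel)
    {
        $items = [];
        foreach (explode("\n", $channel['parameters']['text']) as $line) {
            $line = trim($line);
            if ($line !== '') {
                $items[] = ['type' => $this->ReturnType(), 'text' => $line];
            }
        }
        return $items;
    }

    public function ListSubTypes()
    {
        return ['Lines'];
    }

    // Matches the <Parser_Name> part of the file name.
    public function ReturnType()
    {
        return 'PlainText';
    }

    public function ReturnRequiredParameters()
    {
        return ['text'];
    }
}

// Hypothetical channel definition, using an array in place of the Channel class.
$parser = new PlainTextParser();
$items = $parser->GetAndParse(['parameters' => ['text' => "first report\n\nsecond report"]]);
```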

Impulse Turbines

Impulse Turbines are located in the /Modules/ folder and must follow these rules:

  1. The file name must have the format <Module_Name>PreProcessingStep.php.
  2. The class name has to be the same as the file name.
  3. The class must implement the \Swiftriver\Core\PreProcessing\IPreProcessingStep interface.
  4. The class must be in the namespace Swiftriver\PreProcessingSteps.
  5. The class must contain the following functions:
    • Process($contentItems, $configuration, $logger): Does the processing on the content items.
    • Name(): Returns the impulse turbine name.
    • Description(): Returns the description of this pre-processing step.
    • ReturnRequiredParameters(): Returns an array of required parameters for the pre-processing step.

You may refer to the file GoogleLanguageServicePreProcessingStep.php in /Modules/GoogleLanguageServiceInterface/ folder for an example.
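For orientation, a bare-bones impulse turbine might look like the sketch below. Several things here are assumptions: the interface stub is declared inline only so the example runs standalone (the real one ships with the core), WordCountPreProcessingStep is a made-up module, content items are plain arrays, and Process() is assumed to return the enriched items.

```php
<?php
// Stub declared here only so the sketch is self-contained (assumption).
namespace Swiftriver\Core\PreProcessing;

interface IPreProcessingStep
{
    public function Process($contentItems, $configuration, $logger);
    public function Name();
    public function Description();
    public function ReturnRequiredParameters();
}

// Would live in /Modules/WordCount/WordCountPreProcessingStep.php (hypothetical).
namespace Swiftriver\PreProcessingSteps;

class WordCountPreProcessingStep implements \Swiftriver\Core\PreProcessing\IPreProcessingStep
{
    // Adds a 'wordCount' field to every item and returns the enriched items.
    public function Process($contentItems, $configuration, $logger)
    {
        foreach ($contentItems as $key => $item) {
            $contentItems[$key]['wordCount'] = str_word_count($item['text']);
        }
        return $contentItems;
    }

    public function Name()
    {
        return 'Word Count';
    }

    public function Description()
    {
        return 'Adds a word count to each content item.';
    }

    // This step needs no configuration.
    public function ReturnRequiredParameters()
    {
        return [];
    }
}

// Configuration and logger are not used by this step, so null is passed for both.
$step = new WordCountPreProcessingStep();
$result = $step->Process([['text' => 'Bridge closed near the river']], null, null);
```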

Reactor Turbines

Reactor Turbines are located in the /Modules/ folder and must follow these rules:

  1. The file name must have the format <Module_Name>EventHandler.php.
  2. The class name has to be the same as the file name.
  3. The class must implement the \Swiftriver\Core\EventDistribution\IEventHandler interface.
  4. The class must be in the namespace Swiftriver\EventHandlers.
  5. The class must contain the following functions:
    • HandleEvent($event, $configuration, $logger): Contains the event code.
    • Name(): Returns the name of the reactor turbine.
    • Description(): Returns the description of this event handler.
    • ReturnRequiredParameters(): Returns an array of required parameters for the event handler.
    • ReturnEventNamesToHandle(): Returns an array of the event enumerations the turbine intends to handle.

You may refer to the file UshahidiAPIEventHandler.php in /Modules/UshahidiAPIInterface/ folder for an example.
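A matching sketch for a reactor turbine follows. The same caveats apply: the IEventHandler stub is declared inline only to keep the example self-contained, CounterEventHandler is hypothetical, a plain array stands in for the real event object, and a real handler would return values from EventEnumeration rather than a string literal.

```php
<?php
// Stub declared here only so the sketch is self-contained (assumption).
namespace Swiftriver\Core\EventDistribution;

interface IEventHandler
{
    public function HandleEvent($event, $configuration, $logger);
    public function Name();
    public function Description();
    public function ReturnRequiredParameters();
    public function ReturnEventNamesToHandle();
}

// Would live in /Modules/Counter/CounterEventHandler.php (hypothetical).
namespace Swiftriver\EventHandlers;

class CounterEventHandler implements \Swiftriver\Core\EventDistribution\IEventHandler
{
    public $handled = 0;

    // Reacts to the event by counting the items it carries; content is untouched.
    public function HandleEvent($event, $configuration, $logger)
    {
        $this->handled += count($event['arguments']);
    }

    public function Name()
    {
        return 'Counter';
    }

    public function Description()
    {
        return 'Counts content items seen after processing.';
    }

    public function ReturnRequiredParameters()
    {
        return [];
    }

    // Literal used for illustration in place of an EventEnumeration value.
    public function ReturnEventNamesToHandle()
    {
        return ['ContentPostProcessing'];
    }
}

// Minimal stand-in for event distribution: only deliver events the handler asked for.
$handler = new CounterEventHandler();
$event = ['name' => 'ContentPostProcessing', 'arguments' => ['item one', 'item two']];
if (in_array($event['name'], $handler->ReturnEventNamesToHandle(), true)) {
    $handler->HandleEvent($event, null, null);
}
```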

Important notes to consider during the EventDistribution phase 

  1. The ReturnEventNamesToHandle() function should return values from the enumeration in the /EventDistribution/EventEnumeration.php file. This is also where you can define your own custom event names.
  2. It is most appropriate to place event handlers within the application’s workflow. Application workflows are placed within the /Workflows/ folder. For example, all workflows related to channel activities are placed within the /Workflows/ChannelServices/ folder.
  3. The example below demonstrates how you would invoke an event within a specific place within the workflow:

    // $processedContent holds the content items produced earlier in the workflow.
    $event = new \Swiftriver\Core\EventDistribution\GenericEvent(
        \Swiftriver\Core\EventDistribution\EventEnumeration::$ContentPostProcessing,
        $processedContent
    );

    // Raise the event so that handlers listening for it can react.
    $eventDistributor = new \Swiftriver\Core\EventDistribution\EventDistributor();
    $eventDistributor->RaiseAndDistributeEvent($event);

Please feel free to contact the SwiftRiver team for any further assistance and help. You can contact us by emailing support@swiftly.org

Sweeper v0.2.0 Released

It’s been a while since our last major product release. This is partly because we’ve been working on big projects like the Queensland Floods and our project with Product (RED), and partly because of the holiday break. However, we’re back with a ton of goodies for Sweeper users in 2011.

We’re happy to release the newest build of SwiftRiver:Sweeper today.  Here’s a rundown of features from Director of Platform, Matthew Griffiths.


Today we are releasing the latest version of our Sweeper app!

Sweeper 0.2 doesn’t bring a whole lot of new UI wizardry but under the hood we have been beavering away at cool addition after cool addition – plus the odd fix or two for things we didn’t get quite right last time.

In short, here is some of the wonderful magic you can expect to see in this new release:

Improved pre-install checks in the Installer

We know that we had some issues with our installer last time out and we have been working hard to fix them for this release. Expect to see new checks for pre-requisites and improved checks around the requirements for the Kohana framework.

Yahoo Placemaker Turbine

By activating this impulse turbine, the text of any content coming into Swiftriver can be sent to the ever popular Yahoo Placemaker Service, and if it mentions a recognizable location then the coordinates of that location can be added to the content.

Ushahidi Report Push Turbine

The first – and arguably most important – reactor turbine to be released for the Sweeper app. With the release of this Reactor Turbine you can now twin your Sweeper instance with your Ushahidi instance! Something we know a lot of you out there have been waiting for!!!

A ton of new Source Parsers

Allowing users to aggregate content directly into Sweeper from all of these new sources:

  • Eventful
  • Flickr
  • FrontlineSMS
  • Google News
  • Email
  • Meetup.com

For those of you out there who are interested in the development side of Sweeper – and Swiftriver as a whole – there has been a whole host of activity that isn’t covered above. Look away now if you are easily offended by techie speak!

We have completely remodeled the presence of Swiftriver on GitHub – those of you who watch the repo will have noticed it already.

We now have separate repos for the main open source components of Swiftriver (you can find them by going to the Ushahidi page on GitHub [http://www.github/Ushahidi] and looking for Sweeper, SwiftMeme, SiLCC etc.).

As a consequence of this, the old SwiftRiver repo now holds only the framework files and not the individual applications. This is also the reason for the change in version numbering: Sweeper and SwiftMeme are at v0.2.0 while SwiftRiver core remains at v0.6.0.

There are some complexities about working with this new repository structure, but for all those devs out there interested in contributing to the project I will be blogging a little later this month on how exactly to get started with any of the apps.

So, I think that’s about all for now. Have fun with Sweeper V0.2 and as always give us your feedback if you have any!

Download Sweeper v0.2.0

SwiftRiver Web 101 | Sept 23

Are you interested in the SwiftRiver platform?  Do you want to get a better understanding of our products or find out how to install them?  Well, we’re slowly catching up with demand for documentation, instruction and new features etc.  In the meantime, we cordially invite you to sit on your couch, in your jammies, with a big bowl of Lucky Charms and soymilk (that’s what I’ll be doing) and attend the first ever SwiftRiver Web 101 training seminar.

The event will be held Thursday September 23 8:00am - 10:00am PST/GMT-8

This event is free but there are only 15 slots for potential attendees, so sign up quickly.  We can also accommodate multiple people from the same organization.

We’ve blocked out two hours from our schedule to make sure you can ask all the questions you could possibly ask, whether it’s about installation, security, APIs, code, developing plugins or SwiftApps, integration with Ushahidi/Crowdmap and more.

Planned Topics

  • Brief overview of our web APIs
  • Brief overview of system architecture
  • Sneak peek at new apps
  • Installing the Sweeper app
  • Using the Sweeper app
  • Q&A

We’ll also debut three new features for the Sweeper app and release the next build (including the new features) early, to all the people in attendance! To attend, visit the following link - Click Here to Register for SwiftRiver Web 101