SwiftRiver is an open source project that aims to democratize access to the tools for making sense of data. Find out more at swiftly.org or follow us on Twitter.
Ahmed Maawy, the newest hire to the SwiftRiver project, recently compiled this great ‘how-to’ guide on writing plugins for SwiftRiver applications like Sweeper and SwiftMeme. These plugins can mostly be found at http://plugins.swiftly.org while the wishlist for things we’d like to see built can be found here.
For a great example of how Swift plugins work, check out the Ushahidi Report Push plugin, which allows content verified in Sweeper to be passed along to Ushahidi. Coupled with the Yahoo Placemaker plugin, this is really powerful as it allows all content to pass from Sweeper to Ushahidi, auto-geolocated.
You can view a fully formatted version of this guide on Google Docs.
Before we begin it is worth noting that all SwiftRiver applications have 3 major components:
There are 3 very important elements to understand for SwiftRiver applications when developing and extending the platform for customized functionality (These 3 elements can be considered as “plugins” for SwiftRiver).
It is important to note that the SwiftRiver /Modules folder contains a number of these event handlers (Reactor turbines and Impulse turbines). Sources (also known as Parsers), however, are developed within the /Modules/SiSPS/Parsers folder.
This is a step by step approach regarding how content is received and processed within the core:
Parsers / Sources
Parsers are located within the /Modules/SiSPS/Parsers folder and must follow these important rules:
- GetAndParse($channel): returns an array of Content Items
- ListSubTypes(): Returns the sub types of the Parser
- ReturnType(): Returns the type of the parser (which has to have the same name as the parser you specified in <Parser_Name>).
- ReturnRequiredParameters(): Returns an array of the parameters required to initiate a single source entry for this parser.
For an example of how parsers work, take a look at how content items for Twitter are generated. Content Items are also passed back together with Source data where available. You may also need to know how the object models for a Channel, a Source, and Content are structured. These classes are located within the /ObjectModel/ folder.
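The four parser methods listed above can be sketched roughly as follows. This is only a minimal, self-contained illustration and not code from the SwiftRiver core: the ExampleParser and ExampleContent classes, the rawEntries property, the 'feedUrl' parameter name and the return shapes are all assumptions made for the example. A real parser would work with the Channel and Content classes from /ObjectModel/.

```php
<?php
// Hypothetical stand-in for a Content item; the real class lives in /ObjectModel/.
class ExampleContent {
    public $title;
    public $text;

    public function __construct($title, $text) {
        $this->title = $title;
        $this->text = $text;
    }
}

// Hypothetical parser exposing the four methods named in the guide.
class ExampleParser {
    // Returns an array of Content Items for the given channel.
    // Assumption: the channel already carries raw entries; a real parser
    // would fetch them from the remote source the channel describes.
    public function GetAndParse($channel) {
        $items = array();
        foreach ($channel->rawEntries as $entry) {
            $items[] = new ExampleContent($entry['title'], $entry['text']);
        }
        return $items;
    }

    // Returns the sub types of the parser (a single made-up sub type here).
    public function ListSubTypes() {
        return array('Default');
    }

    // Returns the type of the parser, matching the parser name.
    public function ReturnType() {
        return 'Example';
    }

    // Returns the parameters needed to set up one source for this parser.
    // Assumption: a flat list of names; the real format may be richer.
    public function ReturnRequiredParameters() {
        return array('feedUrl');
    }
}

// Usage with a stand-in channel object.
$channel = new stdClass();
$channel->rawEntries = array(
    array('title' => 'Road blocked', 'text' => 'Bridge closed near the market'),
);
$parser = new ExampleParser();
$items = $parser->GetAndParse($channel);
echo count($items) . " content item(s) parsed\n";
```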
Impulse Turbines
Impulse turbines are located in the /Modules/ folder and must follow these important rules:
- Process($contentItems, $configuration, $logger): Processes the content items.
- Name(): Returns the impulse turbine name.
- Description(): Returns the description of this pre-processing step.
- ReturnRequiredParameters(): Returns an array of required parameters for the pre-processing step.
You may refer to the file GoogleLanguageServicePreProcessingStep.php in /Modules/GoogleLanguageServiceInterface/ folder for an example.
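To make the shape of an impulse turbine a little more concrete, here is a minimal sketch of a pre-processing step with the four methods listed above. It is not taken from the SwiftRiver code base: the class name, the ExampleLogger stand-in with its log() method, the text property on content items, and the assumption that Process() returns the processed items are all hypothetical.

```php
<?php
// Hypothetical logger stand-in; the real SwiftRiver logger may look different.
class ExampleLogger {
    public $lines = array();

    public function log($message) {
        $this->lines[] = $message;
    }
}

// Hypothetical impulse turbine that tidies up whitespace in each content item.
class WhitespaceCleanupPreProcessingStep {
    // Processes the content items and (by assumption) returns them.
    public function Process($contentItems, $configuration, $logger) {
        foreach ($contentItems as $item) {
            // Collapse runs of whitespace into single spaces, then trim the ends.
            $item->text = trim(preg_replace('/\s+/', ' ', $item->text));
        }
        $logger->log('Cleaned ' . count($contentItems) . ' content item(s)');
        return $contentItems;
    }

    public function Name() {
        return 'Whitespace Cleanup';
    }

    public function Description() {
        return 'Collapses repeated whitespace in the text of each content item.';
    }

    // This step needs no configuration, so the list is empty.
    public function ReturnRequiredParameters() {
        return array();
    }
}

// Usage with a stand-in content item.
$item = new stdClass();
$item->text = "  Bridge   closed\nnear the market ";
$logger = new ExampleLogger();
$step = new WhitespaceCleanupPreProcessingStep();
$result = $step->Process(array($item), null, $logger);
echo $result[0]->text . "\n"; // prints "Bridge closed near the market"
```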
Reactor Turbines
Reactor turbines are located in the /Modules/ folder and must follow these important rules:
- HandleEvent($event, $configuration, $logger): Contains the event code.
- Name(): Returns the reactor turbine name.
- Description(): Returns the description of this event.
- ReturnRequiredParameters(): Returns an array of required parameters for the event.
- ReturnEventNamesToHandle(): Returns an array of the event enumerations the turbine intends to handle.
You may refer to the file UshahidiAPIEventHandler.php in /Modules/UshahidiAPIInterface/ folder for an example.
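As a rough illustration of the reactor turbine methods listed above, the sketch below counts the content items carried by an event. Everything except the five method names is an assumption for the sake of the example: the ExampleEvent class with its name and arguments properties, the ExampleLogger stand-in, and the plain string 'ContentPostProcessing' used in place of the real event enumeration value.

```php
<?php
// Hypothetical event stand-in carrying a name and a payload.
class ExampleEvent {
    public $name;
    public $arguments;

    public function __construct($name, $arguments) {
        $this->name = $name;
        $this->arguments = $arguments;
    }
}

// Hypothetical logger stand-in.
class ExampleLogger {
    public $lines = array();

    public function log($message) {
        $this->lines[] = $message;
    }
}

// Hypothetical reactor turbine that counts the content items it is handed.
class ContentCountEventHandler {
    public $handled = 0;

    // Contains the event code: here it only counts and logs.
    public function HandleEvent($event, $configuration, $logger) {
        $count = is_array($event->arguments) ? count($event->arguments) : 0;
        $this->handled += $count;
        $logger->log('Handled ' . $event->name . ' with ' . $count . ' item(s)');
    }

    public function Name() {
        return 'Content Counter';
    }

    public function Description() {
        return 'Counts the content items that pass through post-processing.';
    }

    public function ReturnRequiredParameters() {
        return array();
    }

    // Assumption: a plain string stands in for the real enumeration value.
    public function ReturnEventNamesToHandle() {
        return array('ContentPostProcessing');
    }
}

// Usage: call the handler directly with a stand-in event.
$handler = new ContentCountEventHandler();
$logger = new ExampleLogger();
$event = new ExampleEvent('ContentPostProcessing', array('item one', 'item two'));
$handler->HandleEvent($event, null, $logger);
echo $handler->handled . " item(s) seen\n";
```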
Important notes to consider during the EventDistribution phase
// Wrap the processed content in a generic ContentPostProcessing event.
$event = new \Swiftriver\Core\EventDistribution\GenericEvent(
    \Swiftriver\Core\EventDistribution\EventEnumeration::$ContentPostProcessing,
    $processedContent);
// Raise the event and distribute it to the event handlers.
$eventDistributor = new \Swiftriver\Core\EventDistribution\EventDistributor();
$eventDistributor->RaiseAndDistributeEvent($event);
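For readers wondering what raising and distributing an event might involve, here is a toy dispatcher that routes an event only to handlers that list its name in ReturnEventNamesToHandle(). It is a guess at the general idea, not the actual EventDistributor: the SketchEvent, SketchDistributor and RecordingHandler classes and the Register() method are all hypothetical, and the real core presumably discovers its handlers in /Modules/ rather than having them registered by hand.

```php
<?php
// Hypothetical event stand-in.
class SketchEvent {
    public $name;
    public $arguments;

    public function __construct($name, $arguments) {
        $this->name = $name;
        $this->arguments = $arguments;
    }
}

// Hypothetical handler that records the names of the events it receives.
class RecordingHandler {
    public $seen = array();
    private $names;

    public function __construct($names) {
        $this->names = $names;
    }

    public function ReturnEventNamesToHandle() {
        return $this->names;
    }

    public function HandleEvent($event, $configuration, $logger) {
        $this->seen[] = $event->name;
    }
}

// Toy distributor: delivers an event to every interested handler.
class SketchDistributor {
    private $handlers = array();

    // Assumption: handlers are registered manually in this sketch.
    public function Register($handler) {
        $this->handlers[] = $handler;
    }

    // Returns how many handlers received the event.
    public function RaiseAndDistributeEvent($event) {
        $delivered = 0;
        foreach ($this->handlers as $handler) {
            if (in_array($event->name, $handler->ReturnEventNamesToHandle(), true)) {
                $handler->HandleEvent($event, null, null);
                $delivered++;
            }
        }
        return $delivered;
    }
}

// Usage: only the first handler is interested in this event.
$interested = new RecordingHandler(array('ContentPostProcessing'));
$other = new RecordingHandler(array('SomethingElse'));
$distributor = new SketchDistributor();
$distributor->Register($interested);
$distributor->Register($other);
$delivered = $distributor->RaiseAndDistributeEvent(
    new SketchEvent('ContentPostProcessing', array('processed item'))
);
echo $delivered . " handler(s) notified\n";
```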
Please feel free to contact the SwiftRiver team for any further assistance and help. You can contact us by emailing support@swiftly.org
Patrick Meier and some friends and users of Swift stopped by Affinity Labs in Washington a few days ago with some great suggestions and feature requests for the next release of our Sweeper application. Our work centered largely on rethinking user interaction options. It was an exciting day and we’re really looking forward to incorporating these suggestions in our next release.
Check out the slideshow for some shots of our brainstorming session and the image below for a sneak peek at the proposed redesign.


SwiftRiver is an open source project with the overarching goal of helping people make sense of data on their terms. We do this by adding all types of elusive context to content: tags, predictions for accuracy, indicators of influence, location and so on.
Location is incredibly important to us because one of our many objectives is to help users verify data, and location often serves as a clue about whether content is accurate or not. For example, in this post Vladimir Ermakov describes how Swift attempts to auto-detect the location news articles refer to using statistical analysis of text. That algorithm needs a database of locations to work, however.
Particularly in the case of crisis-mapping, this is key. In applications like the Ushahidi platform, people can aggregate ‘reports’ about events and visualize that data geo-spatially. Because that data comes from the crowd, and because it all needs to be location based (for visualization), it’s critical that the location appended to the message be accurate…or at least as accurate as possible.
So contextualizing crowdsourced data through location is a huge priority. Another priority is ensuring that our platform works much the same offline as it does online. This means we want to ensure that our products rely primarily upon other open source projects whose source code can be deployed on a local machine or behind a firewall.
Recently we realized we were beginning to rely upon Yahoo’s Placemaker service for our location detection features. Yahoo is great, but relying upon such a huge, proprietary product cripples access for some users. We spent several months thinking about building our own alternative (see: SULSa), but ultimately it proved beyond our resources. So we set out to find an open source alternative to Placemaker and we found one in the form of the GeoDict project.
GeoDict is an open source project for pulling location information from unstructured text. Given our recent experiments in this same area, we found the GeoDict project inspiring. So although it was an active project with a growing community, we invited the people behind it to allow SwiftRiver to officially adopt the codebase.
What does this mean?
We’re not entirely sure, but there are some things we do know. Both projects will remain available under the GPL. You’ll see us contribute our staff, time and resources to the development of GeoDict (because it’s an open source project aligned with our greater mission). GeoDict’s community will also actively contribute back to that code, and hopefully they’ll feel welcome enough that they’ll contribute to the SwiftRiver and Ushahidi code bases as well.
GeoDict will be fully integrated into the Swift Web Services family of API products which we offer as both free and paid services, but also as open-source code for anyone out there to use on their own terms.
Big thanks to Pete Warden for creating GeoDict and for supporting our project. Welcome to the Ushahidi family!

Are you interested in the SwiftRiver platform? Do you want to get a better understanding of our products or find out how to install them? Well, we’re slowly catching up with demand for documentation, instruction and new features etc. In the meantime, we cordially invite you to sit on your couch, in your jammies, with a big bowl of Lucky Charms and soymilk (that’s what I’ll be doing) and attend the first ever SwiftRiver Web 101 training seminar.
The event will be held Thursday September 23 8:00am - 10:00am PST/GMT-8
This event is free but there are only 15 slots for potential attendees, so sign up quickly. We can also accommodate multiple people from the same organization.
We’ve blocked out two hours from our schedule to make sure you can ask all the questions you could possibly ask, whether it’s about installation, security, APIs, code, developing plugins or SwiftApps, integration with Ushahidi/Crowdmap and more.
Planned Topics
We’ll also debut three new features for the Sweeper app and release the next build (including the new features) early, to all the people in attendance! To attend, visit the following link - Click Here to Register for SwiftRiver Web 101
SwiftRiver is a platform consisting of a number of unique products and technologies. The goal is to aggregate information from multiple media channels (SMS, Twitter, Email, RSS feeds from the web) and to add context: the ‘who’, ‘what’ and ‘where’ of that which is being discussed in each message. So, who the message is about, what it’s about, and where the message originated from. Swift then uses these details to help predict the relevancy of the information coming to the user. This allows us to promote content the user cares about while suppressing content they are less likely to care about (spam, inaccuracies, falsehoods, and crosstalk).
One of the technologies in the works for the Swift platform is RiverID, a distributed reputation system. It works through a process we call ‘triage’, where two or more (usually three) types of data are compared to gain insights that aren’t possible when looking at any one of them alone.
Let’s use the recent earthquakes in Haiti as an example of how this works. Let’s say we get a message that says “People trapped in a severely unstable building in Neighborhood X.” Our question becomes, who is telling us this? Can they be trusted, and is the information accurate? Traditionally all these questions have to be asked and answered on the fly. That creates a bottleneck on how much information an organization can process: they either put trusted people in the field or they work with vetted organizations on the ground. This isn’t possible for organizations who want to gather crowd-sourced reports. The problem still exists and it’s now amplified because there are even more anonymous people who need to be vetted.
With the above message there are a few ways to attempt verification of what’s being reported. We might start with location. If we know the text message originated from someone in Haiti (there are ways to do this; looking at the country code is one), that location information can then inform our triage dataset.
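The country code check mentioned above could look something like the sketch below. It is an illustration only and not how Swift actually does it: the guessCountryFromPhone() function and the tiny lookup table are made up for the example, and it assumes the sender’s number arrives in international format (for instance with a leading +). Haiti’s calling code is 509.

```php
<?php
// Guesses a region from the calling code at the start of a phone number.
// Assumption: the number is in international format, e.g. "+509 ...".
function guessCountryFromPhone($phone, $countryCodes) {
    // Keep only the digits, dropping "+", spaces and dashes.
    $digits = preg_replace('/\D/', '', $phone);
    if ($digits === '') {
        return null;
    }
    // Calling codes are one to three digits long; try the longest first.
    for ($length = 3; $length >= 1; $length--) {
        $prefix = substr($digits, 0, $length);
        if (isset($countryCodes[$prefix])) {
            return $countryCodes[$prefix];
        }
    }
    return null;
}

// A deliberately tiny table; a real one would cover every calling code.
$countryCodes = array(
    '509' => 'Haiti',
    '254' => 'Kenya',
    '1'   => 'North American Numbering Plan',
);

echo guessCountryFromPhone('+509 3412 3456', $countryCodes) . "\n"; // prints "Haiti"
```

A match on the calling code only says where the number is registered, not where the sender is standing, so it is best treated as one clue among several.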
The second form of context we can attempt to add is corroboration. Are there other reports coming from the same general location and time that corroborate what this message is telling us? If everyone in Neighborhood X is saying that it’s a perfectly sunny day and the kids are playing outside, we have a conflict. Either the crowd is lying or the text message is. So we compare one message with others to see if the stories align, and that becomes an addition to our data set. This used to take a lot of human hours. We want to speed up that process by using algorithms and natural language processing.
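As a very rough sketch of what automated corroboration could look like, the snippet below measures how many other reports share vocabulary with the message in question. This is a simple word-overlap (Jaccard) comparison standing in for the algorithms and natural language processing mentioned above, not Swift’s actual approach. The function names and the 0.2 threshold are assumptions, and the other reports are assumed to have been filtered to the same general place and time already.

```php
<?php
// Splits text into a set of lowercase words.
function tokenize($text) {
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_unique($words);
}

// Jaccard similarity: shared words divided by all distinct words.
function jaccard($a, $b) {
    $wordsA = tokenize($a);
    $wordsB = tokenize($b);
    $union = count(array_unique(array_merge($wordsA, $wordsB)));
    if ($union === 0) {
        return 0.0;
    }
    return count(array_intersect($wordsA, $wordsB)) / $union;
}

// Fraction of the other reports that look similar enough to the message.
// Assumption: 0.2 is an arbitrary threshold chosen for this example.
function corroborationScore($message, $otherReports, $threshold = 0.2) {
    if (count($otherReports) === 0) {
        return 0.0;
    }
    $agreeing = 0;
    foreach ($otherReports as $report) {
        if (jaccard($message, $report) >= $threshold) {
            $agreeing++;
        }
    }
    return $agreeing / count($otherReports);
}

$message = 'People trapped in a collapsed building';
$others = array(
    'building collapsed people trapped inside',
    'sunny day kids playing outside',
);
$score = corroborationScore($message, $others);
echo $score . "\n"; // prints 0.5: one of the two other reports agrees
```

Word overlap cannot tell agreement from contradiction ("building collapsed" versus "no building collapsed" overlap heavily), which is part of why real language processing is needed here.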
The third data set (the last mile) is where this all becomes fun, because location and corroboration can tell us a lot but they aren’t always perfect indicators. So we attempt to look at history. Has this person reported anything before? If so, were they reliable then? Do we know their telephone number? In other words, can we use history as context? This is where RiverID comes in. RiverID allows a user or organization to form a profile on a user’s communication graph. If I (as a user of Swift) know someone’s name, have their phone number, email address, blog url, and social network profiles, I can store all that data as a profile of the source. Then in the future, if I get a text message out of the blue from Haiti, it just may end up being from someone I have a profile on.
The text message is no longer coming from anonymous sources in the crowd; it’s now coming from identifiable sources with unique histories. From that point it’s just a matter of looking back at that user’s history to try to make a decision. If they tend to be reliable and accurate, their RiverID profile will give the statistical advantage to actions they take to verify other reports.
Now, I should note that we (at SwiftRiver) never have access to all that user data. Only the organizations using our platform do; it all happens on their servers or behind their firewalls. We never touch their data, nor would we ever need to, as every use of SwiftRiver is going to have a different context, and subsequently differing needs. RiverID data might only be relevant in specific contexts. Essentially we’ve taken the idea of something like Facebook Connect and we’re making it completely opt-in and completely decentralized (the user stores user profiles, we just reference their database). This allows the organization to access reputation profiles unique to its group’s needs.
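To picture how a source profile and its history might be used, here is a small sketch. It does not reflect RiverID’s actual data model: the SourceProfile class, the findProfile() function and the simple "reliable reports divided by all reports" ratio are assumptions made purely for illustration.

```php
<?php
// Hypothetical profile of a source: known identifiers plus a report history.
class SourceProfile {
    public $name;
    public $identifiers;   // phone numbers, email addresses, blog urls, ...
    public $reliableReports = 0;
    public $totalReports = 0;

    public function __construct($name, $identifiers) {
        $this->name = $name;
        $this->identifiers = $identifiers;
    }

    // Records whether a past report from this source turned out reliable.
    public function recordReport($wasReliable) {
        $this->totalReports++;
        if ($wasReliable) {
            $this->reliableReports++;
        }
    }

    // Share of past reports that were reliable, or null with no history.
    public function reliability() {
        if ($this->totalReports === 0) {
            return null;
        }
        return $this->reliableReports / $this->totalReports;
    }
}

// Finds the profile (if any) that owns the given identifier.
function findProfile($profiles, $identifier) {
    foreach ($profiles as $profile) {
        if (in_array($identifier, $profile->identifiers, true)) {
            return $profile;
        }
    }
    return null;
}

// Usage: a text message arrives from a number we have seen before.
$profile = new SourceProfile('Example Source', array('+50934123456', 'source@example.org'));
foreach (array(true, true, false, true) as $wasReliable) {
    $profile->recordReport($wasReliable);
}
$match = findProfile(array($profile), '+50934123456');
echo $match->name . ': ' . $match->reliability() . "\n"; // prints "Example Source: 0.75"
```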
On a final note, I should say that triage may not always consist of the same data types. In this case it was location, corroboration and user history; in other cases it might include things like the time of the report or accuracy (as determined by the user).
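Since the mix of data types can vary, one way to picture triage is as a function that accepts whatever signals happen to be available. The sketch below simply takes a weighted average of named scores between 0 and 1 and skips any that are missing. This is an assumption about how the pieces could be combined, not a description of how Swift does it, and the triageScore() function, the signal names and the weights are all hypothetical.

```php
<?php
// Combines any number of named signals (each between 0 and 1) into one score.
// Signals that are null (unknown) are left out of the average.
function triageScore($signals, $weights = array()) {
    $weightedSum = 0.0;
    $weightTotal = 0.0;
    foreach ($signals as $name => $value) {
        if ($value === null) {
            continue;
        }
        // Assumption: signals without an explicit weight count equally.
        $weight = isset($weights[$name]) ? $weights[$name] : 1.0;
        $weightedSum += $weight * $value;
        $weightTotal += $weight;
    }
    return $weightTotal > 0 ? $weightedSum / $weightTotal : null;
}

// Location, corroboration and history, as in the example above.
$score = triageScore(array(
    'location'      => 1.0,
    'corroboration' => 0.5,
    'history'       => 0.75,
));
echo $score . "\n"; // prints 0.75

// A different mix of signals, with history unknown for a new source.
$other = triageScore(
    array('location' => 1.0, 'time' => 0.5, 'history' => null),
    array('location' => 3.0, 'time' => 1.0)
);
echo $other . "\n"; // prints 0.875
```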