Developing Plugins for SwiftRiver Applications

Ahmed Maawy, the newest hire to the SwiftRiver project, recently compiled this great ‘how-to’ guide on writing plugins for SwiftRiver applications like Sweeper and SwiftMeme.  These plugins can mostly be found at http://plugins.swiftly.org while the wishlist for things we’d like to see built can be found here.

For a great example of how Swift plugins work, check out the Ushahidi Report Push plugin, which allows content verified in Sweeper to be passed along to Ushahidi.  Coupled with the Yahoo Placemaker plugin, this is really powerful as it allows all content to pass from Sweeper to Ushahidi, auto-geolocated.

You can view a fully formatted version of this guide on Google Docs.


Before we begin it is worth noting that all SwiftRiver applications have 3 major components:

  • SwiftRiver Core - the engine behind content retrieval, processing and storage.
  • Installer - in charge of initial setup of the SwiftRiver platform.
  • Sweeper - the web application, built on top of the Kohana PHP framework, that provides a UI for the operations performed by the SwiftRiver core.

There are 3 very important elements to understand when developing for SwiftRiver applications and extending the platform with customized functionality. These 3 elements can be considered “plugins” for SwiftRiver:

  • Impulse Turbines - elements that process and add value to content received from external sources.
  • Reactor Turbines - event handlers. They are not necessarily meant to add value to content, but to react to specific events within SwiftRiver.
  • Sources - parsers for different types of content. They are responsible for retrieving content from the Internet or other relevant sources and translating it into SwiftRiver content items, so that content from different sources has a uniform format within SwiftRiver.

It is important to note that the SwiftRiver /Modules folder contains a number of these event handlers (Reactor Turbines and Impulse Turbines). However, Sources (also known as Parsers) are developed within the /Modules/SiSPS/Parsers folder.
 
This is a step-by-step view of how content is received and processed within the core:

  • Parsers take the content from the various external sources, and convert it to the Swift object model.
  • Impulse Turbines may act on the SwiftRiver content items and add value to these content items.
  • Reactor Turbines may act on content at any point in its lifetime: before it is processed by Impulse Turbines, during the processing cycle, or later in response to specific user actions (such as marking content as accurate).

Parsers / Sources

Parsers are located within the /Modules/SiSPS/Parsers folder and follow these important rules:

  1. The file name has to follow the <Parser_Name>Parser.php format
  2. The class name has to be the same as the file name
  3. The class must implement IParser
  4. It must be within the namespace SwiftRiver\Core\SiSPS\Parsers
  5. Must contain the following functions:
  • GetAndParse($channel): returns an array of Content Items
  • ListSubTypes(): Returns the sub types of the Parser
  • ReturnType(): Returns the type of the parser (which has to be the same name you specified in <Parser_Name>)
  • ReturnRequiredParameters(): Returns an array of the parameters required to initiate a single source entry for this parser.

You may take a look at how content items for Twitter are generated to get an example of how parsers work. Content Items are also passed back together with Source data where available. You may also need to know how the object model for a Channel, Source, and Content are structured. These classes are located within the /ObjectModel/ folder.
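The parser rules above can be sketched roughly as follows. This is a minimal illustration, not code from the SwiftRiver core: the IParser declaration is a stand-in so the file runs on its own (a real install supplies it), ExampleParser and its 'lines' parameter are hypothetical, and plain arrays stand in for the Channel and Content classes from /ObjectModel/.

```php
<?php
namespace SwiftRiver\Core\SiSPS\Parsers;

// Stand-in for the core's IParser so this sketch is self-contained.
// Its exact shape is an assumption based on the function list above.
interface IParser
{
    public function GetAndParse($channel);
    public function ListSubTypes();
    public function ReturnType();
    public function ReturnRequiredParameters();
}

// Hypothetical parser; it would live in /Modules/SiSPS/Parsers/ExampleParser.php
class ExampleParser implements IParser
{
    // Returns an array of content items built from the channel.
    // A real parser would fetch from the external source and build
    // Content objects; arrays are used here as placeholders.
    public function GetAndParse($channel)
    {
        $items = array();
        foreach ($channel['lines'] as $line) {
            $items[] = array('text' => trim($line));
        }
        return $items;
    }

    // Sub types offered by this parser (placeholder value).
    public function ListSubTypes()
    {
        return array('Lines');
    }

    // Must match the <Parser_Name> part of the file name.
    public function ReturnType()
    {
        return 'Example';
    }

    // Parameters needed to set up one source entry for this parser.
    public function ReturnRequiredParameters()
    {
        return array('lines');
    }
}

$parser = new ExampleParser();
$items = $parser->GetAndParse(array('lines' => array(' first report ', 'second report')));
echo count($items) . " items from the " . $parser->ReturnType() . " parser\n";
```

The Twitter parser mentioned above remains the authoritative reference for how real Channel and Content objects are handled.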

Impulse Turbines

Impulse Turbines are located in the /Modules/ folder and follow these important rules:

  1. Have a <Module_Name>PreProcessingStep.php file name format.
  2. The class name has to be the same as the file name.
  3. The class must implement the \Swiftriver\Core\PreProcessing\IPreProcessingStep class.
  4. Must be in the namespace Swiftriver\PreProcessingSteps 
  5. Contain the following functions:
  • Process($contentItems, $configuration, $logger): Which does processing on the content items.
  • Name(): Returns the impulse turbine name.
  • Description(): Returns the description of this pre-processing step.
  • ReturnRequiredParameters(): Returns an array of required parameters for the pre-processing step.

You may refer to the file GoogleLanguageServicePreProcessingStep.php in /Modules/GoogleLanguageServiceInterface/ folder for an example.
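As a rough sketch of the Impulse Turbine rules above, the example below adds a word count to each content item. The interface declaration is a stand-in so the file runs on its own, WordCountPreProcessingStep is a hypothetical module, content items are plain arrays rather than real Content objects, and returning the items from Process() is an assumption.

```php
<?php
namespace Swiftriver\Core\PreProcessing;

// Stand-in for the core interface; its shape is assumed from the
// function list above. A real install already provides it.
interface IPreProcessingStep
{
    public function Process($contentItems, $configuration, $logger);
    public function Name();
    public function Description();
    public function ReturnRequiredParameters();
}

namespace Swiftriver\PreProcessingSteps;

// Hypothetical module; the file would be WordCountPreProcessingStep.php
class WordCountPreProcessingStep implements \Swiftriver\Core\PreProcessing\IPreProcessingStep
{
    // Adds value to each item by attaching a word count.
    public function Process($contentItems, $configuration, $logger)
    {
        foreach ($contentItems as $index => $item) {
            $contentItems[$index]['wordCount'] = str_word_count($item['text']);
        }
        return $contentItems;
    }

    public function Name()
    {
        return 'WordCount';
    }

    public function Description()
    {
        return 'Adds a word count to each content item';
    }

    // This step needs no configuration parameters.
    public function ReturnRequiredParameters()
    {
        return array();
    }
}

$step = new WordCountPreProcessingStep();
$processed = $step->Process(array(array('text' => 'Bridge closed near the market')), null, null);
echo $processed[0]['wordCount'] . "\n";
```

The $configuration and $logger arguments are passed as null here only because this sketch does not use them; the core would pass real objects.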

Reactor Turbines

Reactor Turbines are located in the /Modules/ folder and follow these important rules:

  1. Have a <Module_Name>EventHandler.php file name format.
  2. The class name has to be the same as the file name.
  3. The class must implement the \Swiftriver\Core\EventDistribution\IEventHandler class.
  4. Must be in the namespace Swiftriver\EventHandlers 
  5. Contain the following functions:
  • HandleEvent($event, $configuration, $logger): Contains the event code.
  • Name(): Returns the name of the event handler.
  • Description(): Returns the description of this event.
  • ReturnRequiredParameters(): Returns an array of required parameters for the event.
  • ReturnEventNamesToHandle(): Returns an array of the event enumerations the turbine intends to handle.

You may refer to the file UshahidiAPIEventHandler.php in /Modules/UshahidiAPIInterface/ folder for an example.
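A Reactor Turbine following the rules above might look roughly like this. Everything in the first namespace is a stand-in so the sketch runs on its own: the real IEventHandler, EventEnumeration and GenericEvent come from the core, and the property names and enumeration value used here are assumptions. CountingEventHandler is a hypothetical module.

```php
<?php
namespace Swiftriver\Core\EventDistribution;

// Stand-ins for core types (assumed shapes, not the real definitions).
interface IEventHandler
{
    public function HandleEvent($event, $configuration, $logger);
    public function Name();
    public function Description();
    public function ReturnRequiredParameters();
    public function ReturnEventNamesToHandle();
}

class EventEnumeration
{
    public static $ContentPostProcessing = 'ContentPostProcessing'; // placeholder value
}

class GenericEvent
{
    public $name;      // hypothetical property names
    public $arguments;

    public function __construct($name, $arguments)
    {
        $this->name = $name;
        $this->arguments = $arguments;
    }
}

namespace Swiftriver\EventHandlers;

// Hypothetical module; the file would be CountingEventHandler.php
class CountingEventHandler implements \Swiftriver\Core\EventDistribution\IEventHandler
{
    public $itemsSeen = 0;

    // Reacts to the event without changing the content itself.
    public function HandleEvent($event, $configuration, $logger)
    {
        $this->itemsSeen += count($event->arguments);
    }

    public function Name()
    {
        return 'Counting';
    }

    public function Description()
    {
        return 'Counts content items that finish processing';
    }

    public function ReturnRequiredParameters()
    {
        return array();
    }

    // Only events listed here should reach HandleEvent().
    public function ReturnEventNamesToHandle()
    {
        return array(\Swiftriver\Core\EventDistribution\EventEnumeration::$ContentPostProcessing);
    }
}

$handler = new CountingEventHandler();
$event = new \Swiftriver\Core\EventDistribution\GenericEvent(
    \Swiftriver\Core\EventDistribution\EventEnumeration::$ContentPostProcessing,
    array('item one', 'item two'));

// Minimal stand-in for the distributor: deliver only events the handler asked for.
if (in_array($event->name, $handler->ReturnEventNamesToHandle())) {
    $handler->HandleEvent($event, null, null);
}
echo $handler->itemsSeen . "\n";
```

The in_array() check only imitates what an event distributor presumably does; in a real install, delivery is handled by the core.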

Important notes to consider during the EventDistribution phase 

  1. The ReturnEventNamesToHandle() function points to an enumeration from the /EventDistribution/EventEnumeration.php file. This is where you can design your own custom enumeration.
  2. It is most appropriate to place event handlers within the application’s workflow. Application workflows are placed within the /Workflows/ folder. For example, all workflows related to channel activities are placed within the /Workflows/ChannelServices/ folder.
  3. The example below demonstrates how you would invoke an event within a specific place within the workflow:

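    // Wrap the processed content in an event tagged with the event type.
    $event = new \Swiftriver\Core\EventDistribution\GenericEvent(
        \Swiftriver\Core\EventDistribution\EventEnumeration::$ContentPostProcessing,
        $processedContent);

    // Raise the event so that handlers listening for this event type can react.
    $eventDistributor = new \Swiftriver\Core\EventDistribution\EventDistributor();
    $eventDistributor->RaiseAndDistributeEvent($event);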

Please feel free to contact the SwiftRiver team for any further assistance and help. You can reach us by emailing support@swiftly.org.

Vote in the Knight News Challenge

Every year the Knight Foundation rewards innovation in technology primarily targeting professional and citizen journalists. The rewards are grants that help projects scale and improve their platforms.  We just entered and wanted to take some time to explain our vision and what we think makes us a worthy applicant.

What is SwiftRiver’s mission?  To democratize access to tools that can be used to filter and make sense of realtime information from SMS, Twitter, Email and the Web.

Where do we add value to news? SwiftRiver is free and open source. This includes APIs for natural language processing, location detection, reputation & trust, duplication filtering and influence detection.

We make these tools open for two reasons: Firstly, because in large newsrooms, staff want complete control over their platforms and they need to be able to modify and customize workflows as needed.  This tends to mean they develop similar tools in-house, which is great for organizations with those types of resources, not so great for organizations without them.  Secondly, our goal is to make these advanced intelligence tools available to journalists in even the most remote, unconnected places.

Who needs our products?  The strongest demand for SwiftRiver is actually from journalists who are increasingly overwhelmed by the task of sorting through vast streams of data.  We’re actually working with several different groups from around the world who want to use applications like Twitter and Facebook to gather news, who share the problem of identifying the kernels of reliable information amidst a sea of ‘noise’.

Why should you vote for us? SwiftRiver has gone from merely a concept that was laid out two years ago, to a tangible product over the last year on very limited resources.  

Although we’re part of the Ushahidi family (still a small company in its own right), we don’t have access to the same financial resources or staff.  They all have their hands full making Ushahidi the great product that it is.  Because we’re a small team, we can’t develop things as quickly as we might like.  Demand is way outpacing our ability to deliver and scale.

We’re a very small team: one full-time person, one part-time developer, and we’ve only this month added a third.

Who are you targeting? Swift is for people overwhelmed by data.  That’s a very broad problem that essentially affects everyone with a computer and connection to the internet.  This makes a singular audience difficult to suss out.  I like to say this: We built a platform and we’re using our platform to target different industries, primarily data journalists.

There are many other uses of the SwiftRiver platform, many that people are discovering without our guidance, and hopefully that means what we’re doing is powerful, adaptable, relevant in different scenarios, easy to use and, most importantly, accessible to all.

Vote or ask questions about SwiftRiver in the Knight News Challenge.

CGI2010 “Improving Access to Modern Technology”. Ushahidi Executive Director, Ory Okolloh on panel at CGI alongside Jack Dorsey (CEO, Twitter & Square), John Chambers (CEO, Cisco), Zhengrong Shi (CEO, Suntech), Ratan Tata (CEO, Tata) and journalist Krista Tippett.

It’s a great video to watch if you want a better understanding of why Ushahidi created the SwiftRiver initiative. This video was recorded at the 2010 Clinton Global Initiative conference this week in New York, NY, USA.

SwiftRiver Web 101 | Sept 23


Are you interested in the SwiftRiver platform?  Do you want to get a better understanding of our products or find out how to install them?  Well, we’re slowly catching up with demand for documentation, instruction and new features etc.  In the meantime, we cordially invite you to sit on your couch, in your jammies, with a big bowl of Lucky Charms and soymilk (that’s what I’ll be doing) and attend the first ever SwiftRiver Web 101 training seminar.

The event will be held Thursday September 23 8:00am - 10:00am PST/GMT-8

This event is free but there are only 15 slots for potential attendees, so sign up quickly.  We can also accommodate multiple people from the same organization.

We’ve blocked out two hours from our schedule to make sure you can ask all the questions you could possibly ask, whether it’s about installation, security, APIs, code, developing plugins or SwiftApps, integration with Ushahidi/Crowdmap and more.

Planned Topics

  • Brief overview of our web APIs
  • Brief overview of system architecture
  • Sneak peek at new apps
  • Installing the Sweeper app
  • Using the Sweeper app
  • Q&A

We’ll also debut three new features for the Sweeper app and release the next build (including the new features) early, to all the people in attendance! To attend, visit the following link - Click Here to Register for SwiftRiver Web 101

On Triage and Verifying Crowdsourced Reports

SwiftRiver is a platform consisting of a number of unique products and technologies. The goal is to aggregate information from multiple media channels (SMS, Twitter, Email, RSS feeds from the web) and to add context: The ‘who’, ‘what’ and ‘where’ of that which is being discussed in each message. So, who the message is about, what it’s about, and where the message originated from. Swift then uses these details to help predict the relevancy of the information coming to the user. This allows us to promote content the user cares about while suppressing content they are less likely to care about (spam, inaccuracies, falsehoods, and crosstalk).

One of the technologies in the works for the Swift platform is RiverID, a distributed reputation system. It works through a process we call ‘triage’, where two or more (usually three) types of data are compared to draw insights that aren’t possible when looking at any one type of data alone.

Let’s use the recent earthquakes in Haiti as an example of how this works. Let’s say we get a message that says “People trapped in a severely unstable building in Neighborhood X.” Our question becomes, who is telling us this? Can they be trusted, and is the information accurate? Traditionally all these questions have to be asked and answered on the fly. That creates a bottleneck on how much information an organization can process: they either put trusted people in the field or they work with vetted organizations on the ground. This isn’t possible for organizations who want to gather crowd-sourced reports. The problem still exists and it’s now amplified because there are even more anonymous people who need to be vetted.

With the above message there are a few ways to attempt verification of what’s being reported. So we might start with location. If we know the text message has originated from someone in Haiti (there are ways to do this, for instance just looking at the country-code is one way) that location information can then inform our triage dataset.

The second form of context we can attempt to add is corroboration. Are there other reports coming from the same general location and time that corroborate what this message is telling us? If everyone in Neighborhood X is saying that it’s a perfectly sunny day and the kids are playing outside, we have a conflict. Either the crowd is lying or the text message is. So we compare one message with others to see if the stories align, and that becomes an addition to our data set. This used to take a lot of human hours. We want to speed up that process by using algorithms and natural language processing.

The third data set (the last mile) is where this all becomes fun, because location and corroboration can tell us a lot but they aren’t always perfect indicators. So we attempt to look at history. Has this person reported anything before? If so, were they reliable then? Do we know their telephone number? In other words, can we use history as context? This is where RiverID comes in. RiverID allows a user or organization to form a profile on a user’s communication graph. If I (as a user of Swift) know someone’s name, have their phone number, email address, blog url, and social network profiles, I can store all that data as a profile of the source. Then in the future if I get a text message out of the blue from Haiti, it just may end up being someone who I have a profile on.

The text message is no longer coming from anonymous sources in the crowd; it’s now coming from identifiable sources with unique histories. From that point it’s just a matter of looking back at that user’s history to try to make a decision. If they tend to be reliable and accurate, their RiverID profile will give the statistical advantage to actions they take to verify other reports.

Now, I should note that we (at SwiftRiver) never have access to all that user data. Only the organizations using our platform do; it all happens on their servers or behind their firewalls. We never touch their data, nor would we ever need to, as every use of SwiftRiver is going to have different context, and subsequently differing needs. RiverID data might only be relevant in specific contexts. Essentially we’ve taken the idea of something like Facebook Connect and we’re making it completely opt-in, and completely decentralized (the user stores user profiles, we just reference their database). This allows the organization to access reputation profiles unique to their group’s needs.

On a final note, I should say that triage may not always consist of the same data types. In this case it was location, corroboration and user history; in other cases it might include things like the time of the report or accuracy (as determined by the user).
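To make the idea of triage a little more concrete, here is a purely illustrative sketch that combines several signals into one score. It is not the RiverID algorithm: the function name, the 0 to 1 scale, the example values and the simple averaging are all assumptions chosen for clarity.

```php
<?php
// Hypothetical triage sketch: each signal is a score between 0 and 1,
// and the triage score is their plain average. Any set of signals can
// be passed in, since triage does not always use the same data types.
function triageScore(array $signals)
{
    if (count($signals) === 0) {
        return 0.0;
    }
    return array_sum($signals) / count($signals);
}

// Made-up example values for a single incoming report.
$report = array(
    'location'      => 1.0, // e.g. country code matches the affected area
    'corroboration' => 0.5, // e.g. some nearby reports agree
    'history'       => 0.0, // e.g. no profile for this sender yet
);

$score = triageScore($report);
echo "Triage score: " . $score . "\n";
```

A real system would likely weight the signals differently depending on context; the point is only that no single signal decides the outcome on its own.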

Panel at Twitter’s Developer Conference

On April 15th at this year’s Twitter developer conference, Tim O’Reilly moderated a panel with Katie Stanton (White House Director of Citizen Participation), Anil Dash (Expert Labs), and Patrick Meier (Ushahidi) called “Twitter as a Force for Good”. They discussed how Twitter was an incredible platform for information gathering for emergency response organizations during the Haiti earthquakes. Other topics of conversation ranged from how the US government keeps up with the latest web technologies, tech innovations originating outside of Silicon Valley and the U.S., and leveraging crowd-sourcing as a way to improve policy decisions. Check out video of the full panel below…



Coordinating Software Developer Volunteers

One of the things we know about software developers contributing to open source projects is that they don’t have a lot of time. Everyone has their day jobs, their personal projects, their families… in other words, life. We like to support a relaxed but structured atmosphere where there are things that need to get done but no pressure on any one volunteer dev.

As a group, they tend to like ‘sprints’ where several developers gather to get as much done as possible in only a few hours. Events like Crisis Camps, Where Camps and Dev Camps are really helpful in that they facilitate spaces where developers can come together to brainstorm and get things done.

However, the one barrier to entry many of them find is that they aren’t comfortable with the language that the rest of the community wants to use, or that the platform is built in. For example Ushahidi is built on the Kohana PHP framework, but a lot of developers prefer to work in Ruby or Python these days. In addition, location plays a role too. We have developers volunteering from every continent, across cultures; some languages are more popular than others across the pond. How do we approach solving this challenge to be as inclusive as possible?

[Photo: Moses Mugisha, Ugandan Volunteer and Developer of SULSa]

We’re using the modular approach. Various components of our systems are built in various languages. The Swift River system itself is being built in PHP on Kohana, the same framework that Ushahidi uses. But SULSa (Swift User Location Services App) is written in Ruby using the Rails framework. Our taxonomy and natural language parsing program, SiLCC (Swift Language Computation Core), is being developed in Python. Ushahidi itself also has an API that anyone can use to pull or push data, using any programming language they want.

Internally, this modular approach allows us to scale, by distributing server load across many different nodes that each handle vertical tasks on their own. But when it comes to coordinating volunteer developers, it means that there’s always something someone can contribute to, which hopefully makes working with our community that much more inviting.

Interested in volunteering with us as a software developer? Check out the following links…