SiLCC for Wordpress v1.6

A few months ago we released several plugins that extend the Swift platform and APIs to Wordpress.  One of these was WP-SiLCC, an auto-tagger that parses text, extracts tags, and applies them to posts to help authors sort content.

The latest build of this plugin is a minor overhaul.  First, WP-SiLCC now integrates TagThe.net, a great platform for auto-tagging long-form content.  Why?  Because SiLCC was built for short-form content (tweets, headlines, and text messages), which traditional natural language processors have a difficult time with.  The new changes allow WP-SiLCC to switch between the two APIs where appropriate.
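As a rough illustration of that switch, here is a minimal Python sketch (the plugin itself is not written in Python). The 280-character cut-off, the function name `choose_tagger`, and the service labels are all assumptions made for this example; the post does not say how the plugin actually decides.

```python
# Hypothetical cut-off between "short-form" and "long-form" text.
# The real plugin's rule is not documented in this post.
SHORT_FORM_MAX_CHARS = 280


def choose_tagger(text: str, max_chars: int = SHORT_FORM_MAX_CHARS) -> str:
    """Return a label for the tagging service suited to this text.

    Short messages go to SiLCC, longer articles go to TagThe.net.
    """
    length = len(text.strip())
    return "silcc" if length <= max_chars else "tagthe"


if __name__ == "__main__":
    print(choose_tagger("Road blocked near the market"))  # short message
    print(choose_tagger("Long article text. " * 100))     # long-form post
```

Keeping the decision in one small function means the threshold can be tuned without touching the code that talks to either service.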

We also did a bit of refactoring and ironed out some bugs, so the plugin is now much better.  You can download WP-SiLCC from Wordpress.org or find the code on GitHub.

SwiftRiver Releases Plugins for Wordpress



For all you Wordpress publishers out there interested in SwiftRiver, we’re releasing two official plugins today that bring Swift to your platform of choice: WP-SiLCC and WP-Veracity.

WP-SiLCC



WP-SiLCC is an auto tagging plug-in. Users who run news sites or aggregators should consider using this to add a basic level of taxonomy to all posts. WP-SiLCC also allows users to tag their own posts for sites that prefer a more folksonomic approach. WP-SiLCC uses active learning techniques to improve how it parses text over time.

Download WP-SiLCC from Wordpress.org

WP-Veracity



WP-Veracity applies Bayesian algorithms to your content to help surface posts based on “interestingness”, influence and time published rather than popularity alone. From SwiftRiver’s perspective, popularity is only an indicator of influence, not necessarily an indicator of authority. This plug-in combines popularity (number of hits, trackbacks, comments), a Bayes score and time (older content falls off organically) to offer a better picture of the most interesting posts on your blog at any given time.
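To make the idea of blending those three signals concrete, here is a small Python sketch. It is not WP-Veracity's formula, which this post does not give. The weights, the log damping, the seven-day half-life, and the function names are all assumptions chosen for illustration, and the Bayes score is taken as a number between 0 and 1 supplied by some classifier.

```python
import math


def popularity(hits: int, trackbacks: int, comments: int) -> float:
    """Dampen raw counts with log1p so one viral post does not dominate.

    The weights (1.0, 2.0, 1.5) are illustrative assumptions.
    """
    return (math.log1p(hits)
            + 2.0 * math.log1p(trackbacks)
            + 1.5 * math.log1p(comments))


def time_decay(age_days: float, half_life_days: float = 7.0) -> float:
    """Exponential decay: a post loses half its weight every half-life."""
    return 0.5 ** (age_days / half_life_days)


def interestingness(hits: int, trackbacks: int, comments: int,
                    bayes_score: float, age_days: float) -> float:
    """Combine popularity, a Bayes score in [0, 1], and age into one score."""
    return popularity(hits, trackbacks, comments) * bayes_score * time_decay(age_days)


if __name__ == "__main__":
    fresh = interestingness(100, 2, 5, bayes_score=0.8, age_days=1)
    stale = interestingness(100, 2, 5, bayes_score=0.8, age_days=30)
    print(fresh > stale)  # older content falls off
```

Multiplying the terms means a post needs some popularity, a reasonable Bayes score, and recency to rank highly; a weighted sum would be an equally valid design with different trade-offs.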

Download WP-Veracity from Wordpress.org




For developers interested in creating their own plugins using Swift Web Services, visit our documentation wiki.

SwiftRiver Web Services Launches



The SwiftRiver Web Services platform offers RESTful apps, hosted in the cloud, that we encourage other developers and applications to use. These services are diverse and powerful ways to improve data collection and management.

For non-profits and NGOs working in the field who may be worried about connectivity or security, all SWS apps are also open source, which means they can be run on your own servers or completely offline.

The first of these web services available is OpenSiLCC. OpenSiLCC allows users to parse and categorize any text on the fly. We are also developing open source applications that demonstrate how the services can be used. They’re potential building blocks for your ideas, with code to help get you started. One of them, Abraxas, is live and can be found here. Get the Abraxas code.

To sign up, visit http://sws.ushahidi.com. What are some potential use cases for OpenSiLCC?


  • SMS messages coming from Frontline or Clickatell could be tagged and categorized in real-time.

  • Users could aggregate non-tagged data (say from Twitter), parse, and output feeds with tags.

  • Organizations could develop their own glossaries and text parsers for content unique to them (or to their language).

  • Identify relationships between seemingly disparate message types (email, sms, twitter).



Sign Up For Web Services



Read the related post “Taxonomy for Text Messages”.

The next version of SwiftRiver (0.2.0 Batuque) will ship with these services (OpenSiLCC and others) fully integrated.

Taxonomy for Text Messages


Getting crowd-sourced information into a system is only the first hurdle; the next is managing it. Last week we announced Swift Web Services, RESTful applications hosted in the cloud that any third-party application or developer can use to assist in managing data. One of those services is SiLCC, a semantic tag extraction service that parses text and extracts relevant keywords from tweets and text messages: tags like the names of people and places, actions that need to be taken, or locations where things have occurred. It is an open service that we host on our servers, meaning anyone can use it in their applications. It will work with Wordpress, Drupal, Frontline SMS, other aggregators like Managing News, and more.

These other applications send the SiLCC API a feed of the content they want tagged. SiLCC then extracts keywords and returns a feed of tags, each linked to the content it refers to. From there, the tags can be used however the original app developers decide.
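The feed-in, tags-out shape of that exchange can be sketched in a few lines of Python. This is a local stand-in, not the SiLCC API or its algorithm: the stopword list, the field names `id`, `text` and `tags`, and the naive keyword extractor are all assumptions made so the example runs on its own.

```python
import re

# Tiny illustrative stopword list; a real tagger would use far more than this.
STOPWORDS = {"a", "an", "the", "in", "on", "at", "of", "to", "is", "are",
             "and", "for", "near", "we", "need"}


def extract_tags(text: str) -> list[str]:
    """Naive keyword extraction: lowercase, drop stopwords and short tokens."""
    tags = []
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        if word not in STOPWORDS and len(word) > 2 and word not in tags:
            tags.append(word)
    return tags


def tag_feed(items: list[dict]) -> list[dict]:
    """Take a feed of {id, text} items, return {id, tags} linked by id."""
    return [{"id": item["id"], "tags": extract_tags(item["text"])}
            for item in items]


if __name__ == "__main__":
    feed = [
        {"id": 1, "text": "Bridge collapsed near the market"},
        {"id": 2, "text": "We need water at the school"},
    ]
    for entry in tag_feed(feed):
        print(entry)
```

Because each returned entry carries the id of the item it came from, the calling application can attach the tags to its own records in whatever way suits it.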


For many organizations, this is a critical time saver. It spares people from having to comb through a system to find useful content. Aggregating content in an Ushahidi instance that uses SiLCC, or in SwiftRiver, would bypass that manual sorting, allowing users to focus on verifying reports and responding to urgent requests.

Tags are the first, autonomous layer of taxonomy for content. They won’t be the only layer, but suppose you’re monitoring 100 different mobile phones sending in messages about the volcanic eruption in Iceland, and you’re looking for the ten that reference one particular cancelled flight. Tags are one of the quickest ways to couple disparate items.

280 Characters or Less

A number of services out there offer similar functionality. In fact, we recently partnered with Thomson Reuters, which offers a service called Open Calais that extracts semantic keywords from articles and blogs. Where Open Calais doesn’t work so well is with shorter messages that are less than a paragraph in length. For managing information from mobile phone users this is a problem, because that content falls well below the threshold of Open Calais. So our partnership allows their service to supplement ours and vice versa.

Active Mobile

SiLCC does one thing in particular differently from many similar apps out there. Rather than build a service that has to be improved by the developers (us), we’ve incorporated active learning techniques that allow it to learn autonomously. This is because we don’t know where or when the next crisis that needs to be monitored will occur. We don’t know who will set up the next SwiftRiver instance or what they’ll use it for. So we designed SiLCC to adapt to any and all scenarios by learning from each instance of use, rather than taking the top-down approach of tweaking the app on demand. This is known as persistent tagging. SiLCC auto-tags content, but also self-improves and accumulates knowledge (or rather, conditions that it can use to improve future decisions).

Natural language processing geeks will wonder whether they can define their own corpora and add words specific to their organization or event directly to SiLCC. Of course they can; this saves time and also improves performance. Additionally, by default we’ve included corpora for dealing with Twitter ontology as well as the TXTSPK (text speak) commonly used by mobile phone users.

Secret Ontology

Finally, the fact that we can predefine corpora gives organizations the option of setting up codes that let people use the system remotely. For instance, we could customize an Ushahidi instance to automatically verify and map any text message that contains a unique string (example: “Help trapped in Port-au-prince Market #a1u9”). That trailing string of alphanumeric characters is like a password that tells the system to do something. An organization could set up these unique character strings and functions, giving them only to people they send to the field. In the event of an emergency, that person could communicate with HQ in ways that the other users of the system couldn’t. We have other apps for auto-detecting location, which makes it simple to extract that data as well. Rather than take a laptop into the field to map data, an organization could set up a specific set of keywords that represent locations or events. Then workers, armed only with phones with SMS functionality, could use the system remotely.

This isn’t why we designed the app, and I doubt many orgs will use it this way, but I think it makes for an interesting possible extension of the Ushahidi platform. A more common use will probably be differentiation between actionable (someone needs something done now) and non-actionable reports (nothing needs to be done) for emergency response organizations.




We announced our alpha of SiLCC last week. If you’re interested in applying to be an alpha tester, click here. SiLCC is open source, so if you’d like to contribute to the project as a developer, follow the project on GitHub.

Uganda’s Victor Miclovich talks Machine Learning

If you were at South by Southwest yesterday or following along, you may have heard some chatter on Twitter about the Africa 3.0 talk by Teddy Ruge of Project Diaspora. In his panel he used Skype video to chat in real time with software developers and incubators in Cameroon, Kenya and with my staff in Uganda. Two of the developers from Appfrica, Moses Mugisha and Victor Miclovich, appeared with me on camera to speak with the crowd. One of them, Victor, briefly discussed his natural language processing project SiLCC. Here’s a quick interview allowing him more time to explain his background, how he got into semantic programming and why peer learning is critical.

SXSW Africa 3.0 Panel




In the post “Natural Language Processing with Swift River” I introduced you to two underlying technologies powering SwiftRiver. Victor Miclovich is the Ugandan volunteer developer who’s spent the last few months working to help make these plans a reality with SiLCC (Swift Language Computation Core).

Victor Miclovich, SiLCC Developer and Volunteer

How did you get involved with natural language processing technologies? It’s not a field many Africans are known to be active in.

Victor Miclovich: When I got hooked up with writing code, I discovered another side of computing as a kid. That side of computing led me to doing heavy research work and this fired up my inquisitiveness.

NLP wasn’t what I played with first. I started with doing work inside of artificial intelligence, which surprisingly had a likeness to programming. As I matured in the area, I realized that one would never really master everything in A.I. (artificial intelligence), and so about 2 years ago I narrowed my work to machine learning.

Machine learning is a wide subject with lots of literature and research work being done in many areas, from computer vision to speech recognition and natural language processing…the list is actually endless. I settled for computer vision work and NLP eventually because of their feasibility and ease of access to technology in Africa. I knew that getting a robot built could be a little bit hard! (laughs) That’s how I got involved with NLP technologies; my curiosity drove me to it.

What inspires you as a software developer?

VM: First, it is my drive and passion for technology. Being able to instruct a machine to do your bidding is something that brings a sense of fulfillment. People don’t always follow my instructions.

Secondly, the people (developers) I encounter wherever I work and go bring inspiration to me…this is just my way of saying that Appfricans are my inspiration…their accomplishments and determination are what keep me going.

How do you see Africa’s role in tech changing over the next 20 years?

VM: Africa’s role in tech is slowly becoming visible. Universities in Africa are slowly churning out new grads every year. These grads have ambition and are tired of staying behind in technology. This is what is going to drive the change in tech.

When students or people get tired of being behind, they develop a strong desire for change…we should not be pessimistic about this, we are optimistic! There are many floating examples all over Africa of tech communities and start-ups sprouting up.

You’re very involved in the community and helping the guys coming up behind you, giving gratis lectures and workshops at your university and mentoring your peers in your spare time. Why do you feel this is important?

VM: It is always important to give back to mankind. Philanthropy has its rewards. I feel that if I don’t do something, those are years lost to the community. I have lived in a place where I’ve seen folks with lots of potential and those that have made it in life and science (or tech). Many stay arrogant and don’t give back to the community…they end up living lavish lives with lots of wealth and of course, who else will suffer? The community will. It suffers because those well-off folks only do things that will help themselves.

On the other side of things, my giving back to the community helps make more folks like me or even better than me. This means that we shall get thinkers rising exponentially and an increase of great ideas that won’t end up being recursively boring but wonderful! These are the main reasons I feel what I’m doing is important.

How has it been working with the global developer community? Have you learned a lot?

VM: Working with a global developer community has been very interesting. I’ve virtually met folks that have done cool stuff with their time and this has been quite inspiring. It has boosted the quality of the work that I do because of the huge amounts I learn from my peers in the global dev community.

You can follow Victor’s work on SiLCC here or on the Swift River mailing-list.