Christian Kreutz explores the many technologies the world is using to make sense of real-world data in the digital domain. These technologies, separately and collectively, enable computers to interpret the world more accurately, as we understand it, in the hope that they’ll be able to tell us more about our reality than we can infer unaided.
Our relationship with these technologies is self-reinforcing: it’s both driven by, and the cause of, an explosion in the ‘sharing’ of content. In other words, the more data we have, the more we want to understand and contextualize it. The more we understand, the greater the motivation to create and share even more.
The Information Age, Amplified
Eric Schmidt, CEO of Google, recently talked about just how fast humans are creating content:
Thanks to the Internet, we now double every two days all stored information. The estimated amount is 5 exabytes according to Eric Schmidt (Google) and it took human kind 2000 years to get a similar amount of archived information.
So how are machines able to parse all this data from the real world? Well, there are a few ways…
- Text Recognition and Natural Language Processing
- Voice Recognition
- Mobile Data Collection
- Image Processing and Computer Vision
That’s a few, but there are a number of other technologies to consider as well: programs for mining the social graph, mapping, checking-in, active learning…too many to list. The point is, the sum of these parts allows for platforms that attempt to understand media as closely as possible to the way humans do. Of course, the benefit of computing is that algorithms work faster and more efficiently than we do. Despite the number of technologies listed above, artificial intelligence isn’t quite where it needs to be to completely automate managing it all.
Just today there were reports that Cuil, a search engine that relied upon semantic parsing algorithms to mine the dark web, might be shutting down. I’m sure their technology was sound and some of the brightest minds in the business started Cuil, but there are real difficulties in relying on machines to do complex tasks where context is the variable.
Crowdsource the Filter
Our approach is to address the problem from a different angle, where humans can distribute work to many, use machines to aggregate the output of that productivity, and then work with smart tools that learn from the users’ needs and expectations. If our code isn’t smart enough to make sense of data on its own (it’s not) but humans are (yet they aren’t as fast or organized), then perhaps part of the solution lies in optimizing human efforts at filtering content, adding context, and using the result as the basis for improving future algorithmic decisions. This is called active learning, where the interactions of a human operator improve the algorithms assigned to perform certain functions.
My colleague Patrick Meier refers to this as Crowdsourcing the Filter. I think that, at least in the near term, this is the future of intelligent computing: smart machines assisting humans, helping us accomplish the tasks we need to accomplish more effectively.
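To make the active learning idea concrete, here is a minimal Python sketch. It is not SwiftRiver's code; the class, the function names, the sample messages and the simulated human are all hypothetical, chosen only for illustration. The filter scores each item, a person is asked about the item the filter is least sure of, and the answer is used to adjust the filter.

```python
import math


class WordWeightFilter:
    """Toy relevance filter: a logistic score over per-word weights (illustrative only)."""

    def __init__(self, learning_rate=0.5):
        self.weights = {}
        self.learning_rate = learning_rate

    def score(self, text):
        """Return a relevance estimate between 0 and 1 (0.5 means 'no idea')."""
        total = sum(self.weights.get(word, 0.0) for word in text.lower().split())
        return 1.0 / (1.0 + math.exp(-total))

    def learn(self, text, is_relevant):
        """Nudge the weight of each word toward the human's label."""
        error = (1.0 if is_relevant else 0.0) - self.score(text)
        for word in text.lower().split():
            self.weights[word] = self.weights.get(word, 0.0) + self.learning_rate * error


def active_learning_round(model, unlabeled, ask_human):
    """Ask a person about the item the model is least sure of, then learn from the answer."""
    most_uncertain = min(unlabeled, key=lambda text: abs(model.score(text) - 0.5))
    unlabeled.remove(most_uncertain)
    label = ask_human(most_uncertain)
    model.learn(most_uncertain, label)
    return most_uncertain, label


def simulated_human(text):
    """Stand-in for a real person: treats crisis reports as relevant (an assumption)."""
    return any(word in text.lower() for word in ("flood", "earthquake"))


def run_demo(rounds=4):
    """Run a few rounds on made-up messages; return the model and what is still unlabeled."""
    unlabeled = [
        "flood waters rising downtown",
        "great lunch special downtown",
        "earthquake damage reported",
        "new lunch menu today",
        "flood damage reported today",
        "concert tickets on sale",
    ]
    model = WordWeightFilter()
    for _ in range(rounds):
        active_learning_round(model, unlabeled, simulated_human)
    return model, unlabeled


if __name__ == "__main__":
    model, remaining = run_demo()
    for text in remaining:
        print(f"{model.score(text):.2f}  {text}")
```

In a real deployment the `ask_human` callback would be a person, or a crowd, answering through an interface rather than a hard-coded rule. The design choice worth noting is that human attention goes only to the items the machine finds most ambiguous, which is where a label teaches the filter the most.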
At CrowdConf next month on October 4th, SwiftRiver will be onsite demonstrating some of the applications we’ve built from this understanding. This is part of our approach to solving the problem of ‘too much data’. We’ll let the big guys like Google, Microsoft and IBM figure out the secrets to scalable AI. In the meantime, our goal at SwiftRiver is to democratize access to tools that help people make sense of data, on their terms.