What is Mojo?
Everybody talks about big data and its many uses, starting with predictive analytics. What you never hear is how usable big data gets created. That is PaperClip Mojo: where big data begins.
PaperClip Mojo is a new Platform as a Service, born in the Cloud, that engages Crowd Sourcing for Big Data processing. The ability to transcribe, translate and interpret Big Data faster and more accurately than ever before is a killer app. Cloud computing turns what took hours and days into seconds and minutes. Crowd Sourcing reaches a global workforce for accuracy and new capabilities never experienced before.
The promise that technology would transform handwriting on paper into usable data has never been achieved and never will be. Large companies today hire offshore organizations, leveraging their inexpensive labor pools to get 24-hour turnarounds.
New opportunities focus on interpreting multimedia into data, surfacing information never seen before that can be used as a competitive advantage. Imagine Realtors sending pictures to a Crowd Source group to capture room color, type of floor, number of windows, kitchen sink style and more. Working this data with predictive analytics can show which homes may sell quicker and for more money.
The transcription of a 60-minute audio file could be returned in less than a minute, in multiple languages. Retailers could make buying decisions based on video captured hours earlier showing what people were wearing that day.
Mojo combines the best that technology can provide with the best recognition engine ever, the human. Combining Crowd Sourcing with PaperClip’s imaging experience and history of innovation will provide the least expensive handwriting-to-data service that meets today’s business and compliance needs. In the future, we look forward to working with many industries, showing them the unique uses of Mojo and how it can make a difference.
Patent Pending 2015
The Cloud is the best platform for processing big data that is contiguous in format and whose results are consistent and repeatable. Contiguous formats are images, audio, video and live streaming feeds. These formats can be segmented into smaller items called SnipIts, cataloged and reassembled. Consistent and repeatable results mean that technology and people interacting with the same SnipIt will produce the same results.
The Mojo service is designed to take full advantage of cloud architecture and services. The ability to create processing instances on demand, identify an incoming document of 100 pages and have every page recognized at the same time: that’s power. So is the ability to create 100 SnipIts from that document and queue them for the next available Clipper.
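The segment, queue and reassemble flow described above can be sketched as follows. This is a minimal illustration, not Mojo's implementation: the `SnipIt` fields, the function names and the use of plain strings as stand-ins for image regions are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from queue import Queue


@dataclass(frozen=True)
class SnipIt:
    """One small unit of work cut from a larger document (hypothetical shape)."""
    document_id: str
    page: int      # 1-based page number in the source document
    index: int     # position on the page, used for reassembly
    payload: str   # stand-in for a cropped image, audio or video segment


def segment(document_id: str, pages: list[list[str]]) -> list[SnipIt]:
    """Cut every page into SnipIts, cataloging where each one came from."""
    return [
        SnipIt(document_id, page_no, idx, region)
        for page_no, regions in enumerate(pages, start=1)
        for idx, region in enumerate(regions)
    ]


def enqueue(snipits: list[SnipIt]) -> Queue:
    """Place SnipIts on a work queue for the next available Clipper."""
    work: Queue = Queue()
    for snipit in snipits:
        work.put(snipit)
    return work


def reassemble(snipits: list[SnipIt]) -> list[list[str]]:
    """Rebuild the pages from SnipIts returned in any order."""
    pages: dict[int, list[SnipIt]] = {}
    for s in snipits:
        pages.setdefault(s.page, []).append(s)
    return [
        [s.payload for s in sorted(pages[p], key=lambda s: s.index)]
        for p in sorted(pages)
    ]
```

Because each SnipIt carries its own page and position, results can come back in any order and still be put back together.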
Microsoft Azure is a natural fit for PaperClip’s Platform as a Service design. The ability to build and connect to reusable components provides a platform for service expansion and the most economical use of the cloud.
Crowd Sourcing is the next game changer: the ability to engage the entire planet. Technology has reached its limits, and its results are not usable. Optical Character Recognition (OCR) cannot read handwriting effectively; someone always has to clean up the mess. Speech-to-text solutions do not perform well with third-party audio either. Some things are just best left to people.
People can recognize handwriting with 98.5% accuracy on a single pass and achieve 99.9% with a blind pass (the 2nd Clipper does not see the 1st Clipper’s characters) or a validation pass (the 2nd Clipper sees both the SnipIt and the results, and agrees or repairs). The same techniques can be applied to other formats, whether audio or video.
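The two second-pass styles can be sketched as below. This is an illustrative sketch only: the `Clipper` type, the function names and the choice to return `None` when two blind entries disagree are assumptions, since the text does not say how disagreements are resolved.

```python
from typing import Callable, Optional

# Hypothetical: a Clipper takes a SnipIt and returns the text they read.
Clipper = Callable[[str], str]


def blind_pass(snipit: str, first: Clipper, second: Clipper) -> Optional[str]:
    """Two Clippers key the SnipIt independently; accept only if they agree."""
    a = first(snipit)
    b = second(snipit)  # the second Clipper never sees the first result
    return a if a == b else None  # None: no agreement (handling is an assumption)


def validation_pass(
    snipit: str,
    first: Clipper,
    reviewer: Callable[[str, str], str],
) -> str:
    """A second Clipper sees the SnipIt and the first result, then agrees or repairs."""
    proposed = first(snipit)
    return reviewer(snipit, proposed)
```

In the blind pass the agreement of two independent readings is the check; in the validation pass the reviewer's returned value is final, whether it matches the proposal or corrects it.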
Mojo is designed to address two types of Crowd Sourcing: Enterprise and Commercial Clippers. Enterprise Clippers are organizations (e.g., internal staff, outsourcing companies) that employ and manage the workforce as a single entity. Commercial Clippers are freelance independent contractors engaged in per-item work for hire. The user experience differs between the two groups. The Enterprise Clipper has a workspace with very simple performance metrics. The Commercial Clipper has more of a gaming experience: monetary performance, competitions and instant cash-out to authorized vendors.
PaperClip Mojo is designed for Bring Your Own Crowd (BYOC). Subscribers will have the option to create and manage their own Enterprise Clippers, contract with PaperClip’s Enterprise Group and/or leverage the Commercial Clippers. The Mojo platform will create the largest and most diverse population of translation specialists and subject matter experts on the planet, providing around-the-clock service. The ability to service the globe is incredible.
There are three primary uses of Mojo: transcription, translation and interpretive action on big data. The ability to approach big data with thousands of small, specialized actions simultaneously creates turnaround times (TAT) never seen before. Processing thousands of forms, free-form pages, and hours of audio and video within seconds is where big data begins.
Forms Processing Use Case
Receiving a 10-page application with 50 fields of data to collect takes 30 seconds TAT. Receiving a 100-page application with 500 fields of data to collect takes 30 seconds TAT.
Free-form Pages Use Case
Receiving a 20-page handwritten manuscript in Chinese and returning English text takes 30 seconds TAT. Receiving a 200-page handwritten manuscript in German and returning English text takes 30 seconds TAT.
Audio Use Case
Receiving a 3-minute audio clip and processing it to text takes 30 seconds TAT. Receiving a 3-hour audio clip and processing it to text takes 30 seconds TAT.
Interpretive Use Case
Receiving thousands of pictures focused on real estate and asking questions about interior detail (e.g., color of the paint in the living room, type of flooring in the kitchen) takes 30 seconds TAT.
Mojo is designed to process big data with a consistent TAT no matter its size. This is the magic of cloud computing and crowd sourcing.
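The arithmetic behind a size-independent TAT can be illustrated with a simple model. It is an assumption made for this example, not a published Mojo formula: SnipIts are handled in parallel "waves", every SnipIt takes the same time, and the 30-second per-SnipIt figure in the usage note is hypothetical.

```python
import math


def turnaround_seconds(snipits: int, clippers: int, seconds_per_snipit: float) -> float:
    """Wall-clock time when SnipIts are processed in parallel waves."""
    if clippers < 1:
        raise ValueError("at least one Clipper is required")
    waves = math.ceil(snipits / clippers)  # rounds of work needed
    return waves * seconds_per_snipit
```

Under this model, `turnaround_seconds(50, 50, 30.0)` and `turnaround_seconds(500, 500, 30.0)` both give `30.0`, while `turnaround_seconds(500, 50, 30.0)` gives `300.0`. The turnaround stays flat only as long as the crowd scales with the number of SnipIts.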
Mojo addresses security in two ways: content to Clippers and context in storage. In distributing thousands of SnipIts to thousands of Clippers, the key is to remove all context from the SnipIt itself. A SnipIt may present the image of “John” to a Clipper, but the Clipper has no knowledge of the source. The SnipIt could be from a job application, parking ticket, retail order form, insurance application or another source. Second, sensitive fields (e.g., SSN, Policy Number, Driver License) are sliced into smaller SnipIts so that the entire value is never presented. Let’s take an SSN as our example. Factored in half, one Clipper will see an image of “12345” while the second Clipper will see an image of “6789”. To the Clipper these are just numbers: possibly part of an address, phone number, account, birth date or something else.
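The field-slicing idea can be sketched as follows. This is a simplified illustration: real SnipIts are images, so a string is used here as a stand-in, and the `slice_value` name and its `parts` parameter are hypothetical.

```python
def slice_value(value: str, parts: int = 2) -> list[str]:
    """Cut a sensitive value into consecutive, non-overlapping pieces."""
    if parts < 1 or parts > len(value):
        raise ValueError("parts must be between 1 and len(value)")
    size, extra = divmod(len(value), parts)
    pieces: list[str] = []
    start = 0
    for i in range(parts):
        # The first `extra` pieces take one more character each.
        end = start + size + (1 if i < extra else 0)
        pieces.append(value[start:end])
        start = end
    return pieces
```

For a nine-digit value, `slice_value("123456789")` returns `["12345", "6789"]`. Each piece would go to a different Clipper, with no label saying which field or document it came from.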
Mojo stores the data just as it was distributed: with no context. Once processed, SnipIts are stored in an optimized scheme and referenced only by a ciphered key. This Shredded Data Storage (SDS) model provides perfect secrecy: if the data were obtained by nefarious means, it would be useless. A stolen copy of the database would have no value by itself. Criminals would have to process the data out through Mojo, which Mojo can watch for and stop. The days of copying data out without any notice are over; Mojo’s SDS now makes its data “Shredded at Rest”.
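One way to picture storage that is referenced only by a ciphered key is the sketch below. It is not PaperClip's SDS design, which the text does not specify. It assumes an HMAC-SHA256 key derivation from Python's standard library, and the class, method and parameter names are hypothetical.

```python
import hashlib
import hmac


def ciphered_key(secret: bytes, document_id: str, field: str, part: int) -> str:
    """Derive an opaque storage key; without `secret` it cannot be recomputed."""
    message = f"{document_id}|{field}|{part}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class ShreddedStore:
    """Keeps each processed piece under its opaque key, with no context columns."""

    def __init__(self, secret: bytes) -> None:
        self._secret = secret  # in practice held apart from the stored rows
        self.rows: dict[str, str] = {}  # what a copy of the database would contain

    def put(self, document_id: str, field: str, pieces: list[str]) -> None:
        """Store each piece separately under its own derived key."""
        for part, piece in enumerate(pieces):
            self.rows[ciphered_key(self._secret, document_id, field, part)] = piece

    def get(self, document_id: str, field: str, parts: int) -> str:
        """Reassemble a value; only possible for a caller holding the secret."""
        return "".join(
            self.rows[ciphered_key(self._secret, document_id, field, part)]
            for part in range(parts)
        )
```

In this sketch the rows hold only hex keys and small fragments, with no document or field name. Relinking them requires going through the service that holds the secret, which is the point at which access could be monitored.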