By Kalwinder Kaur
Diffbot, a developer of visual learning robot technology, has secured a $2 million investment from technology figures including EarthLink’s Sky Dayton; YouSendIt CEO Brad Garlinghouse; Sun Microsystems co-founder Andy Bechtolsheim; and MIT Media Lab Director Joi Ito, as well as executives and founders from Twitter, Facebook, and Yahoo, with participation from Matrix Partners.
Diffbot is a visual content extraction technology that interprets Web pages by their appearance, much as a human reader does. Drawing on machine learning, artificial intelligence, computer vision and natural language processing, the technology detects and extracts the key objects on a Web page. Diffbot’s APIs let application developers incorporate Web page data directly into their own applications, with the aim of turning the entire Web into a usable database. Diffbot currently processes 100 million API calls per month for its customers, supporting tag generation, Web site mobilization, article grouping and clustering, content management system migration, and other functions.
Diffbot has classified the Web into 20 page types, among them products, recipes, review pages and social networking profiles, which it analyzes visually using layout and contextual cues. Because its processing is visual, Diffbot can interpret and extract the content of any page instantly, in any language. So far the company has launched two developer APIs, one for articles and one for front pages. The Front Page API analyzes home and index pages by identifying common layout markers such as headlines, bylines, images, articles and ads. The Article API extracts clean article text and related images and videos from news pages and blogs, then generates unique, cross-referenced tags.
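For developers, an article-extraction API of this kind is typically used by sending the address of a page and reading back structured fields. The Python sketch below illustrates that pattern only. The endpoint URL, the `token` and `url` parameter names, and the response fields (`title`, `text`, `tags`, `images`) are assumptions made for illustration and are not taken from Diffbot's documentation; the response is a canned sample so the code runs without network access.

```python
import json
from dataclasses import dataclass, field
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not Diffbot's documented URL.
ARTICLE_ENDPOINT = "https://api.example.com/article"


@dataclass
class Article:
    """Structured fields assumed to come back from an article-extraction call."""
    title: str
    text: str
    tags: list = field(default_factory=list)
    images: list = field(default_factory=list)


def build_article_request(page_url: str, token: str) -> str:
    """Return a GET URL asking the (assumed) service to analyze page_url."""
    # Parameter names "token" and "url" are assumptions for this sketch.
    query = urlencode({"token": token, "url": page_url})
    return f"{ARTICLE_ENDPOINT}?{query}"


def parse_article(payload: str) -> Article:
    """Map an assumed JSON response onto an Article, tolerating missing keys."""
    data = json.loads(payload)
    return Article(
        title=data.get("title", ""),
        text=data.get("text", ""),
        tags=list(data.get("tags", [])),
        images=list(data.get("images", [])),
    )


# Canned response with a hypothetical shape, used in place of a live call.
SAMPLE_RESPONSE = json.dumps({
    "title": "Example headline",
    "text": "Clean article text without ads or navigation.",
    "tags": ["robotics", "machine learning"],
    "images": ["http://example.com/news/photo.jpg"],
})

request_url = build_article_request("http://example.com/news/story.html", "DEMO_TOKEN")
article = parse_article(SAMPLE_RESPONSE)
print(request_url)
print(article.title, article.tags)
```

In a real integration, the request URL would be fetched over HTTP and the response body passed to `parse_article`; the actual endpoint and field names should be taken from the provider's API reference.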