Another great post by Jeff Nolan (surprise, surprise). There is no question that there is great interest in mining unstructured data, but it's a vast space in terms of both breadth and depth. For example, being able to run a text search across e-mails, PDFs, and PPTs does not mean that one can just as easily search audio, video, and other richer media types. More importantly, there are massive privacy implications for this type of unstructured data mining. Do you want your manager knowing what web pages you visited, what PDFs you read, what audio and video you downloaded? How do I separate what I want indexed from what I do not?
Beyond that, I think that unstructured data will have to be searched together with structured data using some kind of unified metaphor. Yes, it's true that I have a hard time finding the right e-mail, PDF, PPT, etc., but what's even more frustrating is that I can't link these to existing corporate information assets, like my CRM, ERP, or BPM system. I'm still waiting for the company that understands what is really necessary: an information model that spans structured data (RDBMSs, multidimensional databases, and XML data stores) and can model the relationships between those entities and unstructured data. When I want a 360-degree view of the customer, I want one information system that understands the relationship between the customer's service requests, open opportunities, e-mail correspondence, sales presentations, etc., and that can give me a real-time barometer of the health of my relationship with them and, more importantly, tell me what I can do to improve that relationship down the line.
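To make the idea a little more concrete, here is a minimal sketch in Python of what such a unified model might look like. Everything in it is hypothetical and invented for illustration: the `Item` and `InformationModel` names, the `link` and `customer_360` methods, and the sample records are my assumptions, not any vendor's API. It only shows structured and unstructured items hanging off the same customer entity; it does not attempt the hard parts (discovering the links automatically, or scoring relationship health).

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """One piece of information about a customer (hypothetical shape)."""
    source: str       # where it lives, e.g. "crm", "mail_server", "file_share"
    kind: str         # e.g. "service_request", "opportunity", "email"
    structured: bool  # True for database rows, False for documents and mail
    summary: str


@dataclass
class InformationModel:
    """Toy model that links both kinds of data to one customer entity."""
    _links: dict = field(default_factory=lambda: defaultdict(list))

    def link(self, customer_id: str, item: Item) -> None:
        # Record a relationship between a customer and an item.
        self._links[customer_id].append(item)

    def customer_360(self, customer_id: str) -> dict:
        # Group everything known about the customer by kind,
        # regardless of whether it came from a structured source.
        view = defaultdict(list)
        for item in self._links.get(customer_id, []):
            view[item.kind].append(item)
        return dict(view)


# Sample data, made up for illustration only.
model = InformationModel()
model.link("acme", Item("crm", "service_request", True, "Login failure ticket"))
model.link("acme", Item("crm", "opportunity", True, "Renewal under discussion"))
model.link("acme", Item("mail_server", "email", False, "Thread about pricing"))
model.link("acme", Item("file_share", "presentation", False, "Q3 sales deck"))

view = model.customer_360("acme")
for kind, items in view.items():
    print(kind, [i.summary for i in items])
```

The point of the sketch is only that a single query returns service requests, opportunities, e-mails, and presentations side by side. A real system would also have to respect the privacy boundaries raised above when deciding what gets linked at all.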
Companies would be willing to pay A LOT for this kind of technology. Maybe I can get SAP Ventures to fund my next startup idea. :)