Using Hadoop, Servient centralizes large amounts of unstructured data in one place, where it can be managed and organized into enterprise taxonomies. Information Technology departments can reduce costs and improve efficiency by managing a single repository that is available to every business group for developing business opportunities.
The solution enables several sophisticated workflows that combine search with unsupervised and supervised machine learning algorithms running on the Hadoop platform. Users can visualize and interact with the data in order to search and categorize it. They can also embed human knowledge about enterprise taxonomies into enterprise-wide machine learning models that continually operate on incoming data. The solution adapts to evolving taxonomies and applies the changes retroactively. It connects seamlessly with Servient's compliance and e-Discovery applications.
Servient's underlying data model is open. Organizations are not bound to Servient for access to their data. The data is readily accessible, so third-party applications can be plugged in to extract value from it.
Servient uses a "schema on read" data model, which allows users to store data in its raw form and apply structure only when the data is read. The data can adapt to users' evolving taxonomies to meet the needs of their applications. IT departments are not burdened with migrating the data as the data model evolves.
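The schema-on-read idea described above can be illustrated with a minimal sketch. It is not Servient's implementation: the record fields, the `SCHEMA_V1` and `SCHEMA_V2` mappings, and the `read_with_schema` helper are all hypothetical names chosen for illustration. The point it shows is that the raw records are never rewritten; only the mapping applied at read time changes.

```python
import json

# Raw records are stored exactly as received; no schema is enforced at write time.
# (Hypothetical sample data, for illustration only.)
RAW_STORE = [
    '{"id": 1, "from": "alice@example.com", "subject": "Q3 contract", "body": "Draft attached."}',
    '{"id": 2, "author": "bob@example.com", "title": "Audit memo", "text": "See findings."}',
]

# Hypothetical schemas: each output field lists the raw keys it may be read from.
SCHEMA_V1 = {"doc_id": ["id"], "sender": ["from", "author"]}
SCHEMA_V2 = {
    "doc_id": ["id"],
    "sender": ["from", "author"],
    "headline": ["subject", "title"],  # field added later, with no data migration
}


def read_with_schema(raw_line, schema):
    """Parse one raw record and project it onto the given schema at read time."""
    record = json.loads(raw_line)
    view = {}
    for field_name, candidate_keys in schema.items():
        # Take the first raw key that is present; missing fields become None.
        view[field_name] = next(
            (record[key] for key in candidate_keys if key in record), None
        )
    return view


def read_all(schema):
    """Apply a schema to every raw record without modifying the store."""
    return [read_with_schema(line, schema) for line in RAW_STORE]
```

Calling `read_all(SCHEMA_V2)` after the schema evolves yields the new `headline` field for records that were stored before that field existed, which is the property that removes the need for migration.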
The Hadoop ecosystem is the foundation for running machine learning at scale.
Hadoop delivers the distributed parallel processing required for Servient's computationally intensive machine learning, which analyzes every document in the data set. These machine learning tasks include text processing for part-of-speech tagging, dimensionality reduction, establishing semantic relatedness between words as well as between documents, identifying email conversations, and more.
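To make one of these tasks concrete, the sketch below scores the relatedness of two documents with term-frequency cosine similarity. This is a common baseline swapped in for illustration, not the algorithm Servient uses, and it runs on a single machine rather than on Hadoop. The function names `term_vector` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Build a bag-of-words term-frequency vector from lowercase word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(vec_a, vec_b):
    """Return the cosine of the angle between two term-frequency vectors."""
    shared_terms = vec_a.keys() & vec_b.keys()
    dot = sum(vec_a[term] * vec_b[term] for term in shared_terms)
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is related to nothing
    return dot / (norm_a * norm_b)
```

Because every pair of documents can be scored independently, this kind of computation parallelizes naturally, which is why a distributed platform suits it when every document in the data set must be analyzed.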
In addition to computational power, Hadoop provides cost control through elastic cluster capacity. To accommodate the demands of machine learning, Servient "pumps up" the capacity of the Hadoop cluster. Once processing is complete, the cluster returns to its base capacity. This elasticity allows machine learning to run at a controlled cost.
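The scale-up and scale-down behavior described above can be sketched as a simple sizing rule. This is an illustration under stated assumptions, not Servient's capacity logic: the function `nodes_needed` and its parameters (`docs_per_node_hour`, `deadline_hours`, `base_nodes`, `max_nodes`) are hypothetical, and a real cluster would be resized through its own provisioning tools.

```python
import math


def nodes_needed(pending_docs, docs_per_node_hour, deadline_hours, base_nodes, max_nodes):
    """Return the cluster size for the current workload.

    With no pending work the cluster sits at base capacity. Otherwise it grows
    just enough to finish before the deadline, capped at max_nodes for cost control.
    """
    if pending_docs <= 0:
        return base_nodes
    required = math.ceil(pending_docs / (docs_per_node_hour * deadline_hours))
    return max(base_nodes, min(required, max_nodes))
```

For example, with a base of 4 nodes, a batch of 100,000 documents at 1,000 documents per node-hour and a 2-hour deadline calls for 50 nodes; once the batch is done and `pending_docs` drops to 0, the rule returns the base of 4 again.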
The Servient Solution allows human knowledge to be embedded into machine learning models that organize the data into enterprise taxonomies. The models continually operate on incoming as well as legacy data and retroactively adapt to changes in existing taxonomies. The models also generate alerts if the taxonomy is not sufficient to cover the data, prompting the user to extend the taxonomy.
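One way such a coverage alert could work is sketched below. It assumes, purely for illustration, that a model returns a confidence score per taxonomy category for each document; the names `uncovered_fraction` and `taxonomy_needs_extension` and the two thresholds are hypothetical and not taken from Servient's product.

```python
def uncovered_fraction(scores_per_doc, min_confidence=0.5):
    """Fraction of documents whose best category score is below min_confidence."""
    if not scores_per_doc:
        return 0.0
    uncovered = sum(
        1
        for scores in scores_per_doc
        if max(scores.values(), default=0.0) < min_confidence
    )
    return uncovered / len(scores_per_doc)


def taxonomy_needs_extension(scores_per_doc, min_confidence=0.5, alert_threshold=0.2):
    """Signal an alert when too many documents fit no existing category well."""
    return uncovered_fraction(scores_per_doc, min_confidence) > alert_threshold
```

A batch in which half of the documents score below 0.5 in every category would trip the alert at the default 0.2 threshold, prompting a user to add or refine categories.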
Servient's underlying data model simplifies the management of access control lists (ACLs). Servient leverages the security and permissions management features available in HBase. ACL compliance is deeply embedded within Servient's Knowledge Discovery Archive. Servient's advanced search and machine learning algorithms enable knowledge discovery workflows that honor ACLs.
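The idea of a search that honors ACLs can be shown with a small in-memory sketch. It does not use HBase or Servient's Knowledge Discovery Archive; the `Document` class, its `allowed_groups` field, and the `search` function are hypothetical stand-ins that only illustrate the rule that a document is returned only if the user is permitted to see it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # groups permitted to read this document


def search(documents, query, user_groups):
    """Return IDs of documents that match the query and that the user may read.

    The permission check is part of the match condition, so a document the
    user cannot read never appears in the results.
    """
    needle = query.lower()
    groups = set(user_groups)
    return [
        doc.doc_id
        for doc in documents
        if groups & doc.allowed_groups and needle in doc.text.lower()
    ]
```

With this rule, two users running the same query can receive different result sets, each limited to the documents their groups are entitled to read.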