YAVA Data Management Platform is an open source compilation platform that provides a big data management environment, including management and monitoring of Hadoop clusters. Combining the power of the Apache Hadoop ecosystem with ease of use, YAVA is designed to help accelerate Hadoop adoption.

All managed data is stored in the Hadoop Distributed File System (HDFS). The reliability and linear scalability of HDFS provide storage for data in a variety of formats. With support from YARN cluster management, YAVA provides a wide range of data access and processing capabilities in a single cluster: batch, streaming, interactive and real time. To empower developers and designers to build big data processes and jobs, HGrid247 provides visual tools that eliminate the need for programming or scripting.

What’s Included
Distributed File System: HDFS
Cluster Management: Ambari
Data Governance: Atlas, Falcon
Security: Knox, Ranger
Workload Management: YARN, ZooKeeper, Oozie, Slider
Data Processing & Analytics Frameworks: MapReduce, Spark, Storm, Tez
Machine Learning: Mahout, MLlib, SparkR
Online NoSQL: HBase, Accumulo
Analytics SQL: Hive, Phoenix, Spark SQL
Search & Indexing: Solr
Data Flow: Flume, Sqoop, Kafka
Visual Workflow Designer: HGrid247


Community Version

Everything you need to start
your endeavour in the big data world.

Enterprise Version

An all-in-one solution to
your enterprise data problems.



Store, search, explore and secure any kind of electronic document in a document management system with unlimited scalability

With the fast-growing number of documents used in your offices, managing them is becoming a huge challenge, especially when documents are still in hard-copy form. The impact shows up as cost: the cost of storage space, the risk of losing or leaking information without proper access control, and the resources needed to find a specific piece of evidence in time, for example when an audit request asks for a case from years back.

A Document Management System (DMS) is the inevitable answer, and a logical one, since these documents were created electronically in the first place. With a DMS you get audit trails showing who viewed a given document and when, documents indexed by category for easier searching, and your data at your fingertips whenever you need it. However, a conventional DMS application has its limits: large volumes, various types of non-text “documents”, and storage expansion that requires downtime. These limits matter because data is no longer generated only within your offices; it also arrives rapidly through email.

With Chanthel, YAVA provides DMS features for handling documents at Big Data scale. Chanthel's search capability runs on Apache Solr. Equipped with Optical Character Recognition (OCR) technology, Chanthel recognizes text in images or scanned documents and automatically converts it into an editable text file.

Key benefits include:
  • Scalability: storage is effectively unlimited, because adding nodes does not affect the DMS application, which means zero downtime.
  • Parallel Access: any kind of data can be searched, as long as it has key-value metadata stored in the YAVA platform, which is built on open technology.
  • Interoperability: the YAVA platform allows Chanthel to work with the different kinds of data and servers used in an office network, including email servers.
  • Analytics: data at rest is analyzed for knowledge and insights that benefit a data-driven business.
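
As an illustration of searching by key-value metadata, the sketch below builds a query URL for Solr's standard /select handler, combining a free-text query (q) with one filter query (fq) per metadata pair. The host, the core name "documents" and the field names are hypothetical; Chanthel's actual schema and API are not described here.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, text, metadata, rows=10):
    """Build a Solr /select URL: free-text query plus key-value metadata filters.

    base_url, core and the metadata field names are illustrative assumptions.
    """
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    for field, value in sorted(metadata.items()):
        # Quote the value so it is matched as a phrase; escape backslashes and quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{escaped}"'))
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical example: audit documents from 2015 mentioning "invoice".
    print(build_solr_query_url(
        "http://localhost:8983/solr", "documents", "invoice",
        {"category": "audit", "year": "2015"},
    ))
```

Each metadata pair becomes its own fq parameter, so Solr can cache the filters independently of the free-text query.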


Integrate easily with medical devices and manage all types of medical information.

Compared with public-facing industries such as banking or travel, IT adoption in healthcare has progressed slowly in care services. The focus has been more on administration, such as registration or payment. Recently, standards such as HL7 and open technologies such as smartphones have moved healthcare IT toward better care for patients and easier access for clinicians.

Electronic Medical Records (EMR)

Healthcare data such as EMR is a prime example of the three Vs of Big Data: velocity (the speed at which data is generated), variety, and volume. This data is spread among multiple healthcare systems, health insurers, researchers, government entities, and so forth. Furthermore, each of these data repositories is a silo, inherently incapable of providing a platform for global data transparency.

YAVA offers a solution for EMR interchange that is interoperable with a broad ecosystem such as a data lake, moving data between YAVA and other systems, for example those of other hospitals or research institutions. YAVA provides the reliability and flexibility of Hadoop for fast, effective and efficient large-scale data processing. Interoperability in YAVA allows you to reduce cost and effort while preserving the investment in your IT architecture.

Picture Archiving & Communication System (PACS)

Today, the Picture Archiving and Communication System (PACS), which provides proprietary data formats and objects, is the industry standard for storage and retrieval of medical imaging files. However, despite the advent of PACS technology, the images captured from patients remain vastly underutilized: they are stored without undergoing further processing. A compilation of these images could instead be used for analytics in research, to gain insight from information gathered over a long timescale.

PACS in YAVA, built on Apache Hadoop and Apache Solr, embraces machine learning and deep learning to perform analytics on higher-resolution 2D, 3D, 4D, and microscopic images, using distributed image processing on a Hadoop cluster. Combining insights from medical image analytics with information from the patient’s EMR will offer tremendous support to science-driven health care in the near future.


Capture the full content and context of network traffic and speed up threat analysis with streaming analytics

As networks evolve, threats and attacks intended to disrupt service or steal confidential data are increasing tremendously. Threats to a network challenge its integrity and operational capability and raise the cost of operating and maintaining it.

The first step in detecting or preventing attacks is to analyze the connections in the network, the data being transmitted over it, and the types of requests being made. In a small network, constant monitoring and analysis is not difficult. In a large network, however, it is very difficult to carry out the analysis and obtain the metrics on connections, requests and types of data transmitted that are needed to protect the network from zero-day attacks.

The YAVA approach detects anomalous activity and malicious data transmitted over the network by loading, processing and analyzing traffic data with Kafka and Storm in a Hadoop Distributed File System (HDFS) environment. Visualization is handled with Solr and Banana. The results of using this method to detect attacks on a sample dataset are also presented.
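
To make the idea of streaming anomaly detection concrete, here is a minimal sketch of one common technique: counting connections per source in fixed time windows and flagging a source whose count rises far above its own history. This is not YAVA's detection logic, which is not described here. Plain Python lists stand in for the windows that a Storm bolt reading from a Kafka topic might produce, and the function name, threshold and sample addresses are all illustrative assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def flag_anomalous_sources(windows, threshold=3.0, min_history=3):
    """Flag sources whose per-window connection count spikes above their baseline.

    windows: list of lists of source addresses, one list per time window,
             oldest first (a stand-in for a real traffic stream).
    Returns a list of (window_index, source, count) tuples.
    """
    history = {}  # source -> counts seen in earlier windows
    alerts = []
    for index, window in enumerate(windows):
        counts = Counter(window)
        for source, count in counts.items():
            past = history.get(source, [])
            if len(past) >= min_history:
                baseline = mean(past)
                # Fall back to 1.0 when past counts are identical (zero spread).
                spread = pstdev(past) or 1.0
                if count > baseline + threshold * spread:
                    alerts.append((index, source, count))
        # Record this window's counts only after checking it.
        # Note: windows where a source is absent add nothing to its history.
        for source, count in counts.items():
            history.setdefault(source, []).append(count)
    return alerts


if __name__ == "__main__":
    quiet = ["10.0.0.1", "10.0.0.1", "10.0.0.2"]
    burst = ["10.0.0.1"] * 20 + ["10.0.0.2"]
    print(flag_anomalous_sources([quiet, quiet, quiet, burst]))
```

A per-source baseline keeps a busy but steady server from being flagged while still catching a sudden burst from a normally quiet address. A real deployment would also need to expire old history and decide whether flagged windows should feed back into the baseline.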


Get ahead of the competition by harnessing media analytics in your decision making


© 2017 Labs247. All rights reserved.
YAVA logo and HGrid247 logo are registered trademarks or trademarks of the Labs247 Company.
HADOOP, the Hadoop Elephant Logo, Apache, Flume, Ambari, Yarn, Bigtop, Phoenix, Hive, Tez, Oozie, HBase, Mahout, Pig, Solr, Storm, Spark, Sqoop, Impala, and ZooKeeper are registered trademarks or trademarks of the Apache Software Foundation.