Big Data

Knowledgent has a deep heritage in helping organizations across all industries utilize Big Data technology to further their Data and Analytics initiatives and derive critical insights for the business.

Knowledgent is an AWS Advanced Consulting Partner focused on innovating in and through data to impact your business using the latest Big Data technology.

Knowledgent, with its team of more than 300 Informationists, is a pure-play data intelligence company with a focus on the Healthcare, Life Sciences, Financial Services, and Commercial industries.

Knowledgent offers services across Strategy, Architecture, Data Pipeline/ETL, Analytics, Visualization, Cataloging, and Governance. We are passionate about data and analytics; it’s all we do.

Healthcare Case Study

Knowledgent set out to build a platform that business managers could use to oversee physician behavior and practices. This included identifying “good” and “bad” actors among the physicians of interest based on their billing claims. The goal was to base these identifications on informed probabilistic analysis of real-world data.

Vice President Clinical Analytics, Healthfirst

“As a large regional plan serving one in eight New Yorkers, using information in new ways is essential to improving the quality of care our members receive, and the cost-effectiveness of that care. Generating analytic insights using an increasingly diverse and exponentially expanding set of data inputs is key to getting those hard-to-reach benefits that can make a difference to our business and to our members. We use a variety of AWS services to pursue AI/ML-based analytic outcomes affecting member retention and acquisition, risk-modeling and fraud detection. Knowledgent has been a critical partner. They brought both architectural vision and boots-on-the-ground leadership to this effort, which is a major strategic initiative for us.”

AWS Big Data Lake

  • Claims, Membership, and Provider information, consisting of structured and semi-structured data, was ingested from on-premises and partner systems into the AWS Data Lake.

  • The data lake was established on S3 across multiple zones: Landing, Raw, Refined, and Published.

  • The processed data was published into Redshift and RDS for downstream reporting and analysis.

  • Technologies Used: S3, EC2, On-Demand Spark/EMR clusters with Spot instances, CloudFormation, DMS, Data Pipeline, Step Functions, Glue, SNS, Lambda, Redshift/Spectrum, RDS, Athena, etc.
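
As a rough illustration of the zone layout described above, the following Python sketch maps a file to an S3 key in a given zone and names the zone it would be promoted to next. The prefix scheme (lowercase zone name, then dataset, then file name) and the helper names `zone_key` and `next_zone` are assumptions made for this example; the case study names the four zones but does not describe how they were laid out in S3.

```python
# Zone names come from the case study; their order is taken as the
# promotion order. The key layout below is a hypothetical convention.
ZONES = ["Landing", "Raw", "Refined", "Published"]


def zone_key(zone, dataset, filename):
    """Build an S3 object key for a file in the given data lake zone."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    return f"{zone.lower()}/{dataset}/{filename}"


def next_zone(zone):
    """Return the zone that follows `zone`, or None for the last zone."""
    index = ZONES.index(zone)
    return ZONES[index + 1] if index + 1 < len(ZONES) else None


if __name__ == "__main__":
    key = zone_key("Landing", "claims", "2018-01.csv")
    print(key, "->", next_zone("Landing"))
```

Keeping each zone under its own prefix is one simple way to apply different access policies and lifecycle rules per zone.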

Cost Optimization

  • The ingested files were processed on the AWS Data Lake with a serverless architecture built on transient Spark/EMR clusters, with data residing on S3 and accessed through EMRFS
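
A transient cluster of the kind described above can be sketched as a request for the EMR `run_job_flow` API, which boto3 exposes as `boto3.client("emr").run_job_flow(**request)`. The sketch below only builds the request dictionary. The function name `transient_emr_request`, the release label, the instance types, the role names, and the script location are all assumptions chosen for illustration, not details of the actual engagement.

```python
def transient_emr_request(name, script_uri, core_count):
    """Build keyword arguments for EMR run_job_flow (illustrative values)."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-5.36.0",  # assumed release label
        "Applications": [{"Name": "Spark"}],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default roles
        "ServiceRole": "EMR_DefaultRole",
        "Instances": {
            "InstanceGroups": [
                {   # The master node stays On-Demand for stability.
                    "Name": "master",
                    "InstanceRole": "MASTER",
                    "Market": "ON_DEMAND",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": 1,
                },
                {   # Core nodes run on Spot capacity to reduce cost.
                    "Name": "core",
                    "InstanceRole": "CORE",
                    "Market": "SPOT",
                    "InstanceType": "m5.xlarge",
                    "InstanceCount": core_count,
                },
            ],
            # Transient: the cluster terminates once its steps finish.
            "KeepJobFlowAliveWhenNoSteps": False,
            "TerminationProtected": False,
        },
        "Steps": [
            {
                "Name": "spark-job",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    # The script is read from S3, so no data lives on
                    # the cluster after it terminates.
                    "Args": ["spark-submit", "--deploy-mode", "cluster",
                             script_uri],
                },
            }
        ],
    }


if __name__ == "__main__":
    request = transient_emr_request(
        "claims-refine", "s3://example-bucket/jobs/refine.py", 4)
    print(request["Instances"]["InstanceGroups"][1]["Market"])
```

Because the data stays on S3 rather than on cluster-local HDFS, the cluster can be discarded after each run and compute is paid for only while a job is running.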

Security

  • End-to-end security was configured, including network security with VPCs, security groups (SGs), and NACLs; IAM roles and policies; and CloudWatch/CloudTrail for monitoring

  • Data was protected by encryption at rest in S3, RDS, and Cloudera Hadoop clusters with local KMS, and in transit with TLS

Life Sciences Case Study

Knowledgent enabled the ingestion of Healthcare Claims, Electronic Medical Records, Lab Information, and Clinical Trials data for this major pharmaceutical company.

Director, Data Analytics, Takeda

“Takeda and Knowledgent are co-presenting at the Data Summit in Boston to review the deployment of an advanced Real World Evidence analytics platform. The platform is based on an Amazon Web Services infrastructure, and leveraged a series of technologies including S3, Redshift, Lambda, Informatica EIC and Tableau. Knowledgent enabled ingestion of Healthcare Claims, Electronic Medical Records, Lab Information and Clinical Trials data, mapped those data sets to the OMOP industry standard, and enabled analytics and visualizations for Patient Journey, Treatment Pathways, Drug Utilization and Switching Analysis. This platform will enable Takeda to perform advanced patient analytics, at scale, to discover, develop and deliver life enhancing medications.”

AWS Big Data Lake

  • The real-world evidence data is first transformed into the industry-standard OMOP model on transient Spark/EMR clusters that use Spot instances to keep costs low. More than 100 TB of data was processed using clusters of up to 4,800 cores.

  • The standardized data is then loaded into a Clinical Data Lake implemented using Cloudera on EC2 instances.

  • The clinical datasets are cataloged in an enterprise data catalog implemented on a fleet of EC2 instances and a separate data catalog Hadoop cluster

  • The processed data is then available to business SMEs and data scientists for reporting and analysis, enabling analytics and visualizations for Patient Journey, Treatment Pathways, Drug Utilization, and Switching Analysis

  • Technologies Used: S3, On-Demand Spark/EMR clusters with Spot instances, Data Pipeline, EFS, Lambda, persistent Cloudera Big Data clusters implemented on EC2, Informatica Enterprise Data Catalog on a secondary Cloudera Hadoop cluster hosted on EC2, and RDS.

Cost Optimization

  • The ingested files were processed on the AWS Data Lake with a serverless architecture built on transient Spark/EMR clusters, with data residing on S3 and accessed through EMRFS

Security

  • End-to-end security was configured, including network security with VPCs, security groups (SGs), and NACLs; IAM roles and policies; and CloudWatch/CloudTrail for monitoring

  • Data was protected by encryption at rest in S3, RDS, and Cloudera Hadoop clusters with local KMS, and in transit with TLS

New York, NY • Warren, NJ • Boston, MA • Toronto, Canada
©2018 Knowledgent Group Inc. All rights reserved.