An experienced Data Scientist specializing in the Oncology and Immunology therapeutic areas, committed to delivering valuable insights and solutions to large and medium-sized pharmaceutical companies. They have led initiatives such as building Feature Engineering pipelines in PySpark, enabling organizations to derive deeper insights from complex datasets efficiently. This consultant has extensive knowledge of pharmaceutical data sources and can help you, as a client, navigate diverse datasets and derive meaningful insights from them.

Professional Experience

Management Consulting Firm – New Jersey (Oct 2020 – Present)
Project Lead, Data Sciences
⦿ Refine client’s promotional practices by building Feature Engineering pipelines in PySpark for Omnichannel ML models to predict HCP prescribing behavior.
⦿ Lead the development of a brand- and data-agnostic A/B Testing pipeline to assess the effectiveness of marketing campaigns on new prescriptions and patient enrollment in drug assistance programs; automate one-to-one and one-to-many matching using nearest neighbor, propensity score, and error variance algorithms in Python.
⦿ Developed an automated Analytics on Demand system in SQL, Python and Linux to improve ingestion of key HEOR metrics, and chart patient journeys in APLD claims data.
⦿ Formulate and refine business rules for Regimens, Lines of Therapy, and Source of Business assignments in Oncology.
⦿ Support client’s Specialty Data Management team by executing anomaly detection workflows using Python in Dataiku; facilitate data refreshes from third-party data vendors with APLD claims data and Customer Master (Physician data).
⦿ Integrate multiple data sources (IQVIA claims, Customer Master) for complete physician information for commercial targeting.

Financial Firm – Minneapolis, MN (Jul 2019 – Aug 2020)
Data/Technology Analyst
⦿ Reduced outages by 25% by predicting P2 (Priority 2) incidents likely to become P1 with 86% accuracy using a neural network.
⦿ Executed ANOVA and Fisher’s tests to highlight and closely monitor business functions with consistently failing processes.
⦿ Standardized Service Level Objectives for ~300 Technology Catalogs by exploring historical requests.
⦿ Reduced forecasted backlog by 8% for the ServiceNow team by executing anomaly detection in Tableau to identify aging requests.

Analytics Lab – Minneapolis, MN  (Jul 2018 – May 2019)
Data Science Consultant
Client: Leading Education Non-profit
⦿ Influenced efficient resource allocation by designing an A/B testing pilot to assess the performance of 1,600 high-ranking students in a low-touch coaching program.
⦿ Identified students in need of special assistance with college applications using predictive modeling (SVM; 91% accuracy).

Client: Hospitality and Entertainment Firm
⦿ Increased annual revenue by $300k+ by predicting bi-weekly demand of hotel occupancy (Random Forest).
⦿ Addressed declining occupancy rates by identifying customer segments for targeted advertisement and hotel discounts.

Client: Mall of America
⦿ Increased potential for cross-selling at gift stores by ~$5 per product through inventory optimization (Market Basket Analysis).

Non-Profit Organization – Mumbai, India  (Aug 2017 – Mar 2018)
Data Analyst and Technical Writer

⦿ Informed people about commonly perpetrated sexual crimes in communal places using Association Rules Mining.
⦿ Identified steps to curb sexual harassment by analyzing incidents at railway stations using text analytics and data visualization.
⦿ Explored Twitter data for the #MeToo campaign to identify patterns of gender-based violence using text analytics.

Management Consulting Firm – Bangalore, India (Oct 2015 – May 2017)
Decision Scientist

⦿ Drove decisions for Directors across teams through exploratory and predictive analyses of medical claims (Anonymous Patient Level Data) using Teradata SQL and SAS.
⦿ Increased visibility of the client’s HIV drug among healthcare providers by executing analysis for an abstract publication and poster, presented at the Academy of Managed Care Pharmacy in 2017.
⦿ Supplemented information for future clinical trials of the client’s skin cancer drug by analyzing its adherence and switching rates in different Lines of Therapy.
⦿ Influenced payers to give preference to the client’s anticoagulant drug over competitors in their drug formularies by comparing their bleeding event rates (A/B testing).
⦿ Improved insurance support for the client’s Renal Cell Cancer drug by executing cluster analysis on Electronic Medical Records.



Real-time Dashboard
Analyzed real-time tweets about live events using Apache Kafka and Spark Structured Streaming.

Deep Scalable Recommender System
Developed two recommender systems for movie recommendations using collaborative filtering in AWS SageMaker and DSSTNE, chosen according to problem complexity and data scale.

Technical skills

Tools: SQL, R, Python, PySpark, SAS, Tableau, AWS, Hadoop, Hive, Dataiku, Linux
Techniques: Feature Engineering, Machine Learning, Exploratory Data Analysis, Causal Inference (A/B Testing), Statistics, Anomaly Detection, Data Engineering, Business Intelligence, Data Visualization


Master of Science – Business Analytics
University of Minnesota, Carlson School of Management – Minneapolis, MN

Bachelor of Technology, Computer Science and Engineering
National Institute of Technology – Bhopal, India
