Proactive data analyst consultant with a strategic mindset and a track record of delivering results. Demonstrates expertise in Datamart maintenance, data visualization, and diversification analysis. Proficient in outlining patient journey frameworks and providing strategic insights alongside disease progression statistics.
KMK Consulting Inc., Morristown, NJ (Jul 2019 – Present)
Senior Data Analyst Consultant
⦿ Maintained and updated the Datamart for a rare disease product by aggregating data from multiple sources, such as Specialty Pharmacy and Specialty Distributor, using SAS and Alteryx.
⦿ Developed and modified reports, including the daily Datamart report and KPI reports, using Excel and SAS.
⦿ Manipulated CMS Open Payments data and generated reports on market trends and competitor analysis.
⦿ Generated data visualizations of business metrics to help marketing and sales teams make clear decisions and target clients.
⦿ Derived business insights and analytical solutions by performing diversification analysis on Patient Claims, Payer Medical Claims, and Physician data using Python/SQL/Alteryx.
⦿ Outlined a patient journey framework to identify specific treatment events and derive strategic insights, along with disease progression statistics, to help improve the quality and safety of patient care.
⦿ Produced ad hoc reports requested by the client to compare the market landscape.
Manufacturing Firm, Bethlehem, PA (Dec 2018 – May 2019)
Data Analyst Consultant
⦿ Processed abnormal data and restructured data for predictive analytics using MySQL and Excel.
⦿ Segmented data using K-means clustering on the coefficient of variation and seasonal factor.
⦿ Built sales forecasting and demand planning models based mainly on Holt-Winters, SARIMAX, and Prophet.
⦿ Created customized dashboards in Qlik Sense to present forecasting results and important KPIs.
Flight Delay Analysis and Prediction (Nov 2018 – Dec 2018)
⦿ Merged data files and cleaned noisy data for predictive analytics on delay reasons, airlines, etc.
⦿ Created dummy variables and applied Pearson correlation to filter out variables.
⦿ Split data using K-fold cross-validation, then applied logistic regression, K-nearest neighbors, and random forest in both the sklearn and MLlib packages.
⦿ Used confusion matrices and ROC curves to identify the best model, a random forest with an accuracy of 0.88.
Market Analysis Based on New York City Taxi and Limousine Data (Oct 2018 – Nov 2018)
⦿ Processed and explored data on trip distances, terminations, fares, etc. with SQL and Pandas.
⦿ Clustered trips separately by pick-up and drop-off location using K-means clustering.
⦿ Classified customers’ preferences and made recommendations to the company using Tableau.
Prediction of Customers’ Buying Behavior for Fixed Time Deposits (Nov 2017 – Dec 2017)
⦿ Preprocessed data and analyzed customers’ information (job scope, education level, housing loans, etc.).
⦿ Used Pearson correlation and PCA for feature selection.
⦿ Applied decision tree, random forest, K-nearest neighbors, and SVM models and compared their performance.
⦿ Tuned parameters to prevent overfitting, evaluated the models, and identified random forest as the best classifier.
Tools: Python, SQL, R, Alteryx, SAS, EViews, Spark, AMPL, MATLAB, Excel, Tableau, SPSS, Minitab, C, Qlik Sense
Master of Engineering in Industrial & Systems Engineering (Aug 2017 – May 2019)
Lehigh University, Bethlehem, PA
B.E. in Electrical Engineering & Automation (Sep 2013 – Jul 2017)
Beijing Technology & Business University,