A senior data scientist with over 15 years of experience in data analysis, machine learning, NLP, and research, and 7+ years as a data scientist. A leader in machine learning R&D and data strategy. Also responsible for building end-to-end AI solutions and data products.
Responsible for developing the strategic roadmap and building data products to meet company goals. Identified gaps in the client’s data strategy (efficiency, data governance, provenance, and privacy) and value-add opportunities in new data products, and architected a data strategy roadmap from proof-of-concept implementation to production pipeline. Researched and prototyped novel algorithms for data products, including cohort segmentation, customer friction analysis, NLG summarization, and auto-comments. Architected, designed, and collaboratively implemented an end-to-end automated ML pipeline for intelligent customer behavior analysis and orchestration. Collaborated on a UX design for a financial risk model dashboard.
Developed a set of neologism analytics used for nowcasting event detection in streaming data. Developed an analyst-driven classifier for semi-supervised tagging of news articles. Developed a novel distributed approach to query expansion, using Spark for vocabulary building, a custom Word2Vec implementation, and a Golang backend for fast, scalable, concurrent query retrieval. Developed a parameter server running on top of Spark for parallelizing sequential ML algorithms. Developed a novel approach to graph clustering by leveraging averaged word embedding vectors and Hierarchical Agglomerative Clustering. Also built models to detect credit, liquidity, and fraud risks within payments systems.
Analyzed textual data and built and refined representative NLP models. Wrote a linguistic analysis suite covering language identification, transliteration, morphological categorization, and lemmatization. Researched and implemented lexicon-based and ML approaches to sentiment analysis. Explored entity detection to identify and track crisis events. Researched novel ways of classifying unstructured data using ML and rule-based approaches. Researched a novel, robust topic modeling approach for short texts using biterm frequencies over the entire corpus. Used TensorFlow to create a distributional semantic model of text. Experimented with applying Technical Analysis approaches from finance to non-financial time series data in order to better identify outliers, inflection points, seasonality, and trends.