Lead Big Data Engineer/Engineering Manager with experience building large-scale distributed data pipelines (processing 100B events/day), from data ingestion through to actionable insights. Experience with cross-cluster, cross-geography data pipelines. Currently combining hands-on development of the data reporting platform (adding near-real-time capabilities), building a scalable data warehousing solution for business analysts, and managing a team to deliver state-of-the-art reporting capabilities.
Engineer with proven academic and professional credentials and a solid background, looking to make an impact with strong technical, analytical and professional skills. Intrigued by new technology and its adoption; would like to leave a long-lasting footprint on the technological and entrepreneurial roadmap.
Skills and Tech : Hadoop, MapReduce, Java, HBase, Hive, Pig, MySQL, Vertica, Tableau, Kafka, Solr
Currently Learning : Statistical Data Mining, Deep Learning/ Neural Networks @ Google AI
Specialties: Strong Analytic and Mathematical Skills. Love for Computer Science Fundamentals, Emphasis on clean and scalable code, Fast Grasping ability, Strong Prototyping skills, Idea Generation, Leadership Ability and Experience
Senior Software Engineer LinkedIn
Public Company; 10,001+ employees; Part of msft;
Dec 2017 - Present Sunnyvale CA
Member of the Data Org at LinkedIn.
Creating scalable Data Warehousing solutions based on Spark and Presto.
Working on Autonomous Data Quality Frameworks
Engineering Manager (previously Senior Software Engineer) Drawbridge
Startup Company, 100+ employees
August 2016 - December 2017 San Mateo, CA
Led and managed the backend reporting and analytics team for Drawbridge's Ad Platform products.
Also led the ads targeting team for a brief time as a joint responsibility.
Combined hands-on development of the data reporting platform, adding near-real-time capabilities, with managing a team to deliver state-of-the-art reporting capabilities.
Worked on reporting pipelines/Data pipelines that process more than 100 Billion events per day.
Product facing enhancements for the data pipeline.
Drove and engineered a Presto-based Data Warehouse that enabled Business Analysts, BI engineers and Data Scientists to query datasets on HDFS directly, rather than through a serving-layer columnar relational DB, thereby increasing the amount of data people can query and use in their models.
Analytics dashboards based on Data Warehousing
Backend Technologies : Hadoop, Hive, Pig, Spark, HBase, Couchbase, Oozie, Columnar DBs, etc.
Key Project Contributions :
1. Conversion Tracking/Attribution https://drawbridge.com/blog/p/the-drawbridge-approach-to-attribution
2. Cross Device Insights with Real time Attribution Metrics https://drawbridge.com/c/insights&attribution
Software Engineer @WalmartLabs
Public Company; 10,001+ employees; wmt;
December 2013 - August 2016 Sunnyvale, CA
@Labs - WMX (Walmart Exchange)
- Data Mining and Analytics for Online Ads
- Ad Impression Ingestion and Processing : As an early-stage member of the team, implemented and fully owned (dev, test, and product) the ingestion pipeline, processing up to 1B impressions per day from various sources.
- Designed and implemented terabyte scale analytic pipeline
- Worked in a small team to build a lambda-architecture realtime analytics and OLAP platform from the ground up
- Worked with Hadoop/Hive and Apache Spark to process ad impressions data.
- Pioneered use of Apache Spark on the team for improved pipeline performance and maintainability
- Real-time impression ingestion with Apache Storm (experimental).
- Real-time analytics.
- Built a querying system using Apache Solr to build analytics item sets using Clustering of Short Strings in Large Datasets
Languages used: Java, Python
Tech Yahoo! - Software Dev Engineer Yahoo!
Public Company; 10,001+ employees; yhoo; Internet industry
July 2012 – December 2013 Sunnyvale, CA
Media Foundation - Content
Development related to Content Enrichment for all content ingested in the Content Agility pipeline and served on all Yahoo Media properties (Yahoo News, Yahoo Sports, Yahoo Finance, etc.)
End-to-end design and development related to selection of the most relevant Canonical URLs and contextual clickthrough URLs for content. These power all Yahoo-hosted content URLs on all content streams, including the Yahoo homepage.
Sole developer for Canonical URL generation for all content in the Enrichment Workflow, to achieve Search Engine Optimization for each content item (story, photo, video).
Independently Developed Content Enrichment Libraries for Categorization, Content Quality marking, Geo Classification, Named Entity Recognition, etc. on content (stories, photos, videos) through Feed Ingestion and Editorial Ingestion Workflows using Yahoo’s Contextual Analysis Platform.
Enabled the Ads Categorization to be tagged for each content.
Development within Category Taxonomy Management Workflow for Yahoo Content Taxonomy.
Setting up of Continuous Integration Environment for Content Track’s Libraries.
Ramping up on development related to Cassandra and HBase (performance testing of the Hector and Astyanax clients and of different data storage schemas for low latency).
Graduate Teaching Assistant Carnegie Mellon University
Educational Institution; 5001-10,000 employees; Higher Education industry
August 2011 – September 2012 (1 year 2 months)
Fundamentals of Embedded Systems
Software Engineer Intern Samsung Telecommunications America
Public Company; 1001-5000 employees; Telecommunications industry
June 2011 – August 2011 (3 months)
Development of a Browser Performance Benchmarking Suite for Android Devices
Software Engineer Avaya
Privately Held; 10,001+ employees; Telecommunications industry
July 2009 – July 2010 (1 year 1 month)
Worked for Avaya's R&D (TS&D) team, which develops tools & components for easy maintenance & fault-resolution of Avaya's suite of products. Worked on the Expert Systems and HealthCheck products.
University Representative Vishwakarma Institute of Technology, Pune
Educational Institution; 201-500 employees; Higher Education industry
August 2008 – May 2009 (10 months)
Held the apex position in the College Student Council. Student coordinator for the University-led implementation of various schemes in the college. Also responsible for the various social, cultural, technical and sports-related activities, events and festivals in the college.