Vijay is a Big Data Engineer with expertise and experience in the Banking, Financial, and Telecom Services domains, and over 10 years of extensive experience in software development. Worked with SDLC delivery methodologies such as Agile and Waterfall. Cloudera Certified Spark and Hadoop Developer with strong expertise in the Hadoop stack: HDFS, MapReduce, Sqoop, NiFi, Kafka, Flume, Pig, Hive, Spark, Scala, Python, and Java. Expertise in implementing CI/CD pipelines in production environments. Basic cloud knowledge and willingness to work on any cloud platform. Major strengths are familiarity with multiple software systems; the ability to learn new technologies quickly and adapt to new environments; and being a self-motivated, focused team player and quick learner with excellent interpersonal, technical, and communication skills.
Project: Liquidity, MTT, STaRS, Client profitability
• Provided end-to-end solutions using the conformance framework (Spark, Scala, Spring).
• Read data from the Hive raw layer, cleansed it according to mapping documents, and loaded it into the Impala layer based on business requirements.
• Created main and pre/post-validation scripts.
• Performed Spark optimization, tuning, and performance improvement.
• Built end-to-end CI/CD pipelines (Lightspeed, UCD, Jenkins).
• Created change requests for production deployments with detailed documentation and obtained approval from the appropriate application managers.
• Performed end-to-end production deployment activities and provided support and validation.
• Monitored the Spark UI and cluster usage.
• Managed a small development team; trained and supported team members.
• Coordinated with the BA, ingestion, MSTR, stakeholder, and PS teams.
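The raw-to-conformed flow described above can be sketched in plain Scala (no Spark dependency) as a mapping-document-driven cleanse step. The column names and rules here are hypothetical illustrations, not the production mapping documents:

```scala
// Minimal sketch of a mapping-document-driven cleanse step. In the real
// pipeline this logic runs inside Spark transformations over the Hive raw
// layer; plain Scala Maps stand in for DataFrame rows here.
object Conformance {
  type Row = Map[String, String]

  // Hypothetical mapping document: raw column -> conformed column.
  val columnMapping: Map[String, String] =
    Map("acct_no" -> "account_id", "bal_amt" -> "balance", "ccy" -> "currency")

  // Per-field cleansing rules (trim everything, uppercase currency codes).
  def cleanse(field: String, value: String): String = field match {
    case "currency" => value.trim.toUpperCase
    case _          => value.trim
  }

  // Rename columns per the mapping document, drop unmapped ones, cleanse values.
  def conform(raw: Row): Row =
    raw.collect { case (col, v) if columnMapping.contains(col) =>
      val conformed = columnMapping(col)
      conformed -> cleanse(conformed, v)
    }
}
```

Keeping the mapping as data rather than hard-coded logic lets a new mapping document be applied without code changes.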
Project: Design and implement a Kafka-Spark pipeline for data analytics.
Description: Using a Kafka-Spark cluster, data is taken from routers and sent to a pub/sub system.
• Analyzed and processed router big data with complex computations and iterations using Spark and Kafka streaming applications.
• Developed and maintained a web portal for the platform project, which involved building a publish/subscribe system for internal clients.
• Built ETL processes for a variety of network carrier data.
• Developed and maintained the publish/subscribe system for different data types.
• Read, processed, and parsed data files using Spark/Scala.
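The parse step for router data files can be sketched as follows. The pipe-delimited record format (`routerId|timestamp|metric|value`) is a hypothetical example; the real carrier formats are not shown in this document:

```scala
// Sketch of parsing router records before publishing to a pub/sub topic.
// Malformed lines are dropped rather than failing the whole batch, the
// usual pattern in a streaming ingest path.
case class RouterEvent(routerId: String, epochMillis: Long, metric: String, value: Double)

object RouterParser {
  def parse(line: String): Option[RouterEvent] =
    line.split('|') match {
      case Array(id, ts, metric, v) =>
        // toLongOption/toDoubleOption guard against garbage numeric fields.
        for {
          t <- ts.toLongOption
          d <- v.toDoubleOption
        } yield RouterEvent(id, t, metric, d)
      case _ => None
    }

  // flatMap silently discards unparseable lines.
  def parseAll(lines: Seq[String]): Seq[RouterEvent] = lines.flatMap(parse)
}
```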
Project: ClickFox integration with the Telus data lake cluster and data pipeline configuration for data analytics using the ClickFox application.
• Developed code for importing and exporting data between the Telus data lake and the ClickFox cluster.
• Created an 18-node cluster within the data lake for ClickFox data processing.
• Responsible for data review and preparation.
• Responsible for managing data coming from different sources and filtering it according to ClickFox Standard Event Format requirements using Hive.
• Created a Spark/Scala application for reading, processing, and storing data into Hive.
• Used Spark SQL, DataFrames, and Datasets in this application.
• Created a control table to log application status.
• Used Bitbucket for version control.
• Created the CI/CD pipeline using TeamCity and uDeploy.
• Used Autosys to schedule jobs for automatic deployment.
• Created a shell script that runs the job after verifying that files have been loaded.
• Followed Agile methodology, with Jira as the tracking tool.
• Coordinated with data scientists, the ClickFox infrastructure and data-prep teams, data admins, and the business team.
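The control table mentioned above (a run-status log written by the application) can be sketched like this. The column set and status values are assumptions for illustration, not the production schema, and the real table lived in Hive rather than in memory:

```scala
// Sketch of a control table used to log application run status.
// Column names and status values are illustrative assumptions.
case class ControlEntry(appName: String, runDate: String, status: String, message: String)

class ControlTable {
  private var entries = Vector.empty[ControlEntry]

  // Record that a run has started.
  def logStart(app: String, runDate: String): Unit =
    entries :+= ControlEntry(app, runDate, "RUNNING", "job started")

  // Record the final outcome of a run.
  def logFinish(app: String, runDate: String, ok: Boolean, msg: String): Unit =
    entries :+= ControlEntry(app, runDate, if (ok) "SUCCESS" else "FAILED", msg)

  // Latest status for a given application and run date, if any -
  // this is what a scheduler or rerun check would query.
  def latestStatus(app: String, runDate: String): Option[String] =
    entries.reverse.collectFirst {
      case e if e.appName == app && e.runDate == runDate => e.status
    }
}
```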
Project: Design and implement a Hadoop platform to support enterprise-wide batch, real-time, and ad hoc data analytics and consumption
• Responsible for managing data coming from different sources.
• Developed code for importing data from the Netezza warehouse into HDFS and Hive using Sqoop.
• Used MapReduce to read and clean the data and store it in HDFS.
• Created partitioned Hive external tables.
• Experienced in managing and reviewing Hadoop log files.
• Defined job flows in Oozie to schedule and manage Apache Hadoop jobs as a Directed Acyclic Graph (DAG) of actions with control flows.
• Set up real-time data ingestion using Apache Kafka and Flume, storing the data in HBase.
• Designed and developed reports and dashboards for the Finance domain using Tableau.
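The read/clean/aggregate stage above can be illustrated with Scala's collection API, which mirrors the map and reduce phases of a Hadoop job. The CSV record format (`customerId,amount`) is hypothetical; the real job ran on Hadoop MapReduce, not in-memory collections:

```scala
// Illustration of a MapReduce-style read/clean/aggregate stage using plain
// Scala collections. Blank and malformed rows are cleaned out in the map
// phase, then amounts are summed per key in the reduce phase.
object CleanAndAggregate {
  // Map stage: parse and clean each record, emitting (key, value) pairs.
  def mapStage(lines: Seq[String]): Seq[(String, Double)] =
    lines.flatMap { line =>
      line.split(',') match {
        case Array(id, amt) if id.trim.nonEmpty =>
          amt.trim.toDoubleOption.map(id.trim -> _)
        case _ => None
      }
    }

  // Reduce stage: sum amounts per key, as a combiner/reducer would.
  def reduceStage(pairs: Seq[(String, Double)]): Map[String, Double] =
    pairs.groupMapReduce(_._1)(_._2)(_ + _)
}
```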
Project: Customer Accounts and Financial Management
• Gathered user requirements and specifications from users.
• Documented business requirements, functional specifications, and test requirements.
• Identified the key facts and dimensions necessary to support the business requirements.
• Prepared prototype models to support the development efforts.
• Developed various complex SQL queries, stored procedures, functions, packages, and Database Triggers.
• Involved in the design, development, and testing of the PL/SQL stored procedures, packages.
• Developed Linux Shell scripts to automate repetitive database processes and maintained shell scripts.
• Provided input into updates and modifications of the product documentation and supported the testing team in delivering the highest product quality to customers.
• Involved in analysis, estimation, design, development, testing, and 24x7 production support.
• Worked on DB2 9.5 on Windows and Linux.
• Performed DB2 installations on Windows and Linux, and applied fix packs.
• User administration: granting DML and DDL rights to users, profile setup, etc.
• DB2 utilities: REORG, RUNSTATS, backup (offline and online), restore, db2top.
• Hands-on and troubleshooting experience with DB2.
• Restored databases both on the same platform and across different platforms (export/import, load, db2move).
• Managed performance and tuning of SQL queries and fixed the slow-running queries.
• Worked with relational database models, schemas & entity-relationship diagrams (ERDs) to create technical/system design documents.
• Experience maintaining and troubleshooting databases in production, test, and development environments.
• Basic scripting knowledge to automate DB2 backups, db2diag alerts, etc.