Resume
As a Senior Engineer in Data Platform at Sophos, I have been building a new Apache Iceberg-based data pipeline for our data lake. Before this, I focused on creating a new batch detection framework that uses data in the data lake to help threat hunters catch potential customer compromises before they happen.
I lead and mentor a high-performing team of 5+ engineers, driving cross-team collaboration, elevating code standards, and cultivating a culture of continuous improvement and ownership.
I have used my skills in AWS, Java, Spring, Python, and relational and non-relational databases to deliver high-quality solutions that increase transparency, accountability, and efficiency for our clients. Additionally, I have been mentoring 5+ junior engineers in software development best practices and helping them strengthen their technical skills.
Before joining Sophos, I worked at o9 Solutions, Goldman Sachs, Eureka AI, and Samsung Research Institute, where I gained valuable experience in developing proprietary databases from scratch, building web and desktop applications, handling data pipelines in large-scale systems, building user tracing systems based on users’ constantly changing locations, and building UX using cutting-edge technologies. I have a Master’s degree in Computer Science and Engineering from IIT Kanpur and a Bachelor’s degree in the same field from IIT Guwahati.
I am passionate about exploring new horizons, taking on new challenges, and solving intricate problems in the software engineering world.
Contact Information
-
Email: [email protected]
-
LinkedIn: itspawanhere
-
Website: itspawanhere.com
Education
-
Indian Institute of Technology, Kanpur
Location: Kanpur, India
Degree: Master’s in Computer Science
Dates: July 2016 - May 2018
-
Indian Institute of Technology, Guwahati
Location: Guwahati, India
Degree: B.Tech in Computer Science and Engineering
Dates: July 2010 - May 2014
Professional Experience
-
Sophos, Bangalore, IN
Senior Software Engineer II
October 2023 - Present
-
Building an Apache Iceberg-Based Data Lake: Built a new Apache Iceberg pipeline to pull data from XDR, MDR, firewall, and other sources, replacing our Hive-based legacy pipeline. Queries now run about twice as fast, storage needs dropped by roughly 30%, and data is available in the S3 data lake over 50% sooner. The pipeline also added write support for S3 and time-travel queries, allowing us to track changes automatically for auditing purposes.
Tools/Technology used: Spring Boot, Kafka, Iceberg, Glue, Athena, Grafana, Logz, Metabase.
-
Batch Detections & ETL: Created a batch processing framework in Amazon Managed Workflows for Apache Airflow (MWAA) where threat researchers can register their detection workflows. Features include running detections in multiple phases, segmenting customers by license type, and running ETL jobs before detection. The framework helped automate threat detections approximately 10 times faster.
Tools/Technology used: AWS, Airflow, Kafka, Trino, Avro, Redis, Python, Terraform
-
AWS Cluster Migration: Managed AWS infrastructure using Terraform to provision, migrate, and tear down clusters as needed. For example, I shifted ECS workloads from Fargate to EC2, set up auto-scaling based on demand, and right-sized instances per service. These changes cut infrastructure costs by several thousand dollars annually, improved resource utilization, and made deployments more consistent and reliable.
Tools/Technology used: AWS, Terraform, Unix Scripting.
-
Mentoring Work: Led and mentored a high-performing team of 5+ engineers, driving adoption of modern engineering practices, elevating code standards, and cultivating a culture of continuous improvement and ownership.
-
o9 Solutions, Bangalore, IN
Senior Software Engineer
August 2021 - September 2023
-
Lightweight Scenario and Scenario Sharing: Scenarios support what-if analysis on a dataset. Implemented a special type of scenario called “Lightweight scenarios” for the o9 proprietary database that minimizes data replication and memory usage. Also implemented scenario sharing, which lets a copy of the whole what-if version of the dataset be shared in read-only or write mode with users and roles.
Tools/Technology used: C#, SQL Server, Redis, RabbitMQ, Solr, MongoDB, Visual Studio.
-
Audit Trail: Implemented a robust auditing system that tracked all transactions on o9’s database, increasing transparency and accountability; it reduced transaction tracking time by 10x and improved data integrity.
Tools/Technology used: Java, Spring Boot, Hadoop, Lucene, Protobuf, HBase, SQL Server
-
Hadoop Agent: Collaborated on a project to synchronize data between o9’s proprietary database and Hive. Developed a service for seamless data and schema synchronization, enabling unified access to Hive data.
Tools/Technology used: Java, Spring Boot, Hadoop, Hive, SQL Server
-
Health Check for Services: Designed and built a solution that enabled o9’s Java-based services to report their health and configuration status to a centralized dashboard at regular intervals. By tracking machine failures and incorrect configurations, it supported adherence to service-level agreements and allowed for timely remedial action. The outcome was an improvement of more than 20% in our SLA.
Tools/Technology used: Java, Spring Boot, RabbitMQ, Elasticsearch, Kibana
-
Goldman Sachs, Bangalore, IN
Engineering Associate
January 2021 - July 2021
-
Lumos: Enhanced “LUMOS”, an internal desktop application used at GS. The primary objective was to centralize open workflows, tickets, and notifications for all GS employees in a unified, intuitive desktop app. Added message boards and outage notifications, and helped the support team minimize turnaround time (TAT) on open tickets. This improved efficiency by over 30%.
Tools/Technology used: Angular, Electron, TypeScript, SQLite, HTML, CSS, Java, Spring Boot
-
Eureka AI, Bangalore, IN
Senior Software Engineer
June 2018 - January 2021
-
Eureka Omni: Developed a versatile tool called Omni, which enables the creation of customized dashboards using pre-existing or new widgets to generate reports in diverse formats. The solution provides a streamlined approach to report creation and empowers clients to effortlessly share and disseminate information with stakeholders.
Tools/Technology used: Cube.js, Spring Boot, React
-
Contact Tracing: Developed a user tracing system using Telco Data to identify potential sources of COVID-19 infection with user consent. Incorporated Point of Interest data to help identify potential infection sites. The system was used by government agencies to mitigate the spread of the virus.
Tools/Technology used: Apache Druid, Leaflet, Spring Boot, React
-
Credit Score: Engineered an end-to-end system to ingest and query credit score data using the Eureka AI engine. The system enabled queries via mobile number or national ID, providing direct access to user credit scores.
Tools/Technology used: Elasticsearch, Spring Boot, Bootstrap, React
-
Data Pipeline for Telco Dataset: Built a data transformation tool, “pipecore”, that automated the integration of data from various telcos into our proprietary internal format, optimized for use with AI engines. The tool also transferred data between databases, including Hive and Elasticsearch, keeping data flowing smoothly across our systems and improving data pipeline efficiency by over 3x.
Tools/Technology used: Java, Hadoop, Oozie, Spark, Hive, Python, Kafka, Elasticsearch, Kibana
-
Eureka User Profile Store: Designed and developed the UI and backend of the Eureka User Profile Store using RESTful APIs. The platform allowed for the exploration of telco users and the extraction of specific user segments using multiple filters. Additionally, implemented a suite of data visualization tools that dynamically responded to selected filters, empowering clients to generate intelligence reports for their target users. The success of this initiative resulted in 6 telcos across 5 countries contracting our services.
Tools/Technology used: Java, Spring Boot, Elasticsearch, React, Bootstrap, ChartJS
-
Campaign Management System: Developed a system that lets a Campaign Director create and manage campaigns for any set of target customers. The major challenge was matching the best users to each campaign during simultaneous runs. Using CMS, Eureka ran 10-100 campaigns in parallel, tracking users through the funnel with complete BI and analytics integration for the campaigns.
Tools/Technology used: Spring Boot, MySQL, React
-
Samsung Research Institute, Bangalore, IN
Software Engineer
June 2014 - July 2016
-
Splitbilling: Engineered a robust and scalable billing system that enables system administrators to efficiently monitor call, SMS, and data usage for both personal and enterprise accounts through a unified dashboard. As a result, this initiative yielded significant cost savings for the company by mitigating expenses related to personal usage.
Tools/Technology used: Python, MongoDB, Flask, JavaScript, Bootstrap
-
Email Glue: Led the development of an advanced email automation system, capable of automatically creating tasks and events based on email content, thus enabling a zero-inbox policy for the Samsung email application. In addition, architected a Google Drive add-on for Samsung Email, allowing mail attachments to be directly integrated with Google Drive. This comprehensive solution improved email management efficiency, enhancing overall productivity while ensuring data integrity and accessibility.
Tools/Technology used: Android, Java, Google Drive SDK, Android Studio
Awards and Honors
-
Jury’s Favorite Award, Hackathon
Won Jury’s favorite award at a Hackathon organized by o9 Solutions for building a customer support bot using LangChain and OpenAI’s ChatGPT APIs.
September 2023
-
Employee of the Quarter, o9 Solutions
Received the “Employee of the Quarter” award at o9 Solutions for outstanding contributions and exceptional achievements in the workplace.
January 2023
-
Tizen App, Samsung
Won a Tizen Samsung Phone for creating a holiday planner app on Tizen Store.
May 2015
-
Top 0.18% Rank, Joint Entrance Examination
Achieved a ranking within the top 0.18% among over 500,000 candidates.
May 2010
-
Merit-cum-Means Scholarship, IIT Guwahati
Awarded annually for outstanding academic performance over four consecutive years.
2010 - 2014
-
Certificate of Merit, CBSE
Honored for being among the top 0.1% of students nationwide in Social Science.
May 2008
-
First Prize, State-Level Essay Competition
Awarded first place in a competitive state-level essay writing event.
May 2007
Skills and Other
-
Programming Languages: Java, Python, JavaScript, SQL, Unix Scripting
-
Tools & Frameworks: Spring Boot, Git, React, NextJS, Chakra-UI, Tailwind, ShadcnUI, Jenkins, GraphQL, Angular, Electron, Docker, Kubernetes (K8s), Terraform, CI/CD, Nginx, LaTeX
-
Industry Knowledge: AWS, GCP, Airflow, Hadoop, Hive, HBase, Presto, Iceberg, ELK Stack, MongoDB, Firebase, PostgreSQL, RabbitMQ, Kafka, Redis
-
Additional Skills: Consistently able to type at 90+ WPM.
Languages
- English: Fluent
- Hindi: Native