I Am a Data Engineer, a Solutions Architect, a Software Engineer, a Solutions Engineer

Hello there!

I am an Information Systems graduate student at SDSU with a passion for technology and an innate storytelling ability.

I believe that words and data have the power to reshape the world. I discovered my love for coding at a young age, and my interest in the field has only intensified over time, particularly in the realm of data.

I bring 4+ years of experience as a data engineer and have a unique ability to bridge the gap between technology and business by finding innovative solutions to complex problems.

Furthermore, my dedication to social causes and volunteer work reflects my commitment to making a positive impact beyond technology. My mission is to combine the power of technology and business to drive societal change. My technical acumen, storytelling prowess, and unwavering dedication to making a difference make me a force to be reckoned with in the field of Information Technology.


Education
  • San Diego State University

    Master's in Information Systems | San Diego

    Majored in Systems Analysis and Data Science.

    Aug 2021 - May 2023

  • Jawaharlal Nehru Technological University

    Bachelor's in Information Technology | Hyderabad

    Aug 2016 - May 2020


My Skills
Python/SQL: 90%
Machine Learning: 95%
Big Data (Spark/Databricks): 85%
Cloud Platforms (AWS/Azure): 85%
DevOps & CI/CD: 80%
Data Engineering: 90%

Experience
  • Data Engineer
    Google@Compass, Mountain View, CA

    Apr 2024 - Present

    - Led a 5-member data engineering pod for Finance BI, automating month-end workflows and delivering reports 24 hours faster.

    - Architected an Airflow-driven alerting pipeline streaming Spark job metrics to Splunk and Grafana, reducing incident MTTR by 70%.

    - Designed behavioral-science dashboards in Tableau & Looker that tripled self-service analytics adoption across finance and product teams.

    - Built a real-time bid-optimization engine (Python & Spark Streaming) processing 5 K+ bids/sec, boosting win-rate 22% and saving $1.2 M annually.

    - Developed Python/SQL churn and ARR models, informing $5 M in new ARR decisions.

    - Optimized lakehouse tables and workloads, cutting pipeline runtime 75% and cloud spend 30%.

    - Unified web analytics and application logs into a BigQuery customer-360 dataset for segmentation and targeted experiments.

  • Data Architect
    MedImpact Healthcare Systems Inc., San Diego, CA

    Sep 2023 - Feb 2024

    - Led next-gen digitization and OCR of ~500 K pharmacy-benefit questionnaires and handwritten claims using Python, Tesseract, and OpenCV—cutting manual entry time by 40% and saving $250 K.

    - Architected data pipelines with REST APIs, Snowflake schemas, Spark, Hadoop, and Hive to process ~10 M daily events under HIPAA compliance, enabling anomaly detection and scalable ML workflows.

    - Dockerized microservices with Git version control and Jenkins CI/CD, increasing deployment frequency from weekly to daily.

    - Developed real-time Tableau and Power BI dashboards for 200 K patients and 120+ clinicians, boosting analytics engagement by 35%.

    - Conducted root-cause analyses to raise data-trust scores to 95% and cut escalations by one-third.

    - Partnered with finance to realize $600 K in incremental revenue within six months of rollout.

  • Data Engineer - Intern
    Tideworks Technology, Seattle, WA

    Jun 2022 - Aug 2022

    - Partnered with port-operations teams to streamline berth scheduling and reduce average vessel turnaround time by 18%.

    - Ingested IoT sensor streams from 100+ global terminals via Kafka into BigQuery, creating a single source for real-time logistics KPIs.

    - Developed Spark ML models that forecast berth availability and congestion; achieved 87% accuracy and informed daily dispatch decisions.

    - Built Tableau and Splunk dashboards visualizing dwell time, crane utilization, and queue lengths, enabling data-driven adjustments on the pier.

    - Authored dbt tests and a metadata dictionary to enforce schema standards and cut new-hire onboarding by 30%.

    - Contributed to analytics OKRs and delivered insights that shaped two new SaaS features for port clients.

  • Student Research Assistant
    San Diego State University, San Diego, CA

    Aug 2021 - May 2023

    - Ran data models under the guidance of a professor, performing data cleaning, analysis, and documentation, and co-authoring a journal article.

    - Investigated and developed new methodologies and architecture for the Port of San Diego by integrating IoT, AI, blockchain, cybersecurity, supply-chain, and ML technologies in a multidisciplinary approach spanning Computer Science, Information Science, and Business Management.

    - Utilized Bloomberg Terminal to extract stock‐market data for research, achieving a 95% efficiency rate through advanced analysis tools.

    - Created 5+ dashboards and 20+ charts in Tableau, leveraging stock‐market data for data visualization and insights.

  • Technology Associate Web Master
    United Nations Volunteer, Surfside, FL

    Sep 2020 - Mar 2023

    - Developed a responsive website with payment gateways for Save the Water, achieving a 30% increase in mobile traffic and an improved cross-device user experience.

    - Built an Angular 10 application using TypeScript, NgRx, and RxJS with reusable components, REST/SOAP web services, WebSockets, and comprehensive form validations for efficient data handling.

    - Led the full software development lifecycle from wireframing and requirements gathering to deployment, utilizing modern web technologies (CSS3, HTML5, Java) to deliver user-friendly interfaces and maintainable code architecture.

  • Software Engineer - Data
    Eagle Infra India Pvt. Ltd., Mumbai, India

    Dec 2018 - Jul 2021

    - Designed cloud-based SmartCity data lake integrating 50+ IoT sensors via Fivetran into Spark/Snowflake, with real-time anomaly detection pipelines using Kafka and Spark Structured Streaming, reducing incident response time by 25%.

    - Built demand forecasting models (ARIMA, LSTM) for utilities achieving <8% MAPE, and delivered Power BI/Tableau dashboards to 75 municipal decision-makers, shortening planning cycles by 30%.

    - Automated Airflow workflows improving data freshness by 90 minutes and reducing manual ETL effort by 50%, while mentoring 3 junior engineers and documenting best practices to increase team velocity by 15%.

My Interests

Data Engineering & Analytics

Designing and building scalable data pipelines, ETL processes, and real-time analytics solutions. Specializing in big data technologies, cloud platforms, and data warehousing to transform raw data into actionable insights.

Machine Learning & AI

Developing intelligent systems and predictive models using advanced machine learning algorithms. From computer vision applications like traffic detection to financial forecasting with LSTM networks, creating AI solutions that drive business value.

Cloud Architecture & DevOps

Building robust, scalable cloud infrastructure and implementing CI/CD pipelines. Expertise in AWS, Azure, and containerization technologies to ensure high availability, performance, and seamless deployment of data-driven applications.

25 Data Pipelines Built

1 Million+ Records Processed Daily

5 Years of Professional Experience

99.9% System Uptime Maintained

My Portfolio

Publications

Get In Touch

Phone

+1 650 918 2990

Email

ridhithakur826@gmail.com

Address

San Francisco, California