Vicknesh Rethinavelu / Cloud Platform Engineer · Currently @ MongoDB
Bengaluru, India · 9+ years · Open to roles

Vicknesh Rethinavelu

Cloud Operations Engineer · Platform Engineering Specialist

Senior Cloud Platform Engineer building self-healing infrastructure at MongoDB. Previously led a team of 5 at NoBroker — saving ₹10L+ annually on logging infra and shipping a self-service staging platform for 100+ developers.

9 yrs cloud experience
91% CKA score (top 9th pct)
3 major clouds (AWS · GCP · Azure)
5 DevOps engineers led (2 years)

01

About

Nine years deep in cloud platforms, container orchestration, and the quiet engineering of systems that need less attention over time — through automation, observability, and self-service.

I spend my days designing systems that take operational toil and convert it into Go code, RabbitMQ queues, and Temporal workflows. The work I'm proudest of looks small from the outside: a logging pipeline that's 10× cheaper, a staging environment that spins up in under five minutes, a Go service that quietly closes 38% of incident tickets before a human reads them.

I'm CKA-certified at 91% (top 9th percentile), AWS/GCP certified, and have led a team of five through a PCI DSS audit. I write about what I build — two technical articles on Medium have reached over a thousand engineers.

02

Measurable impact

38% manual ops work eliminated (MongoDB · Go automation)
₹10L/year infrastructure cost saved (NoBroker · ClickHouse migration)
10:1 log compression achieved (LZ4HC · 5K+ logs/sec)
100+ developers unblocked (Project Blackbox · staging-as-a-service)

03

Experience

Cloud Operations Engineer

MongoDB · Atlas platform
Dec 2023 – Present · 2 years

Building Go services that categorize, correlate, and auto-resolve Atlas infrastructure issues across AWS, GCP, and Azure — turning 3 years of repetitive playbook work into deterministic automation.

Platform engineering

  • Automated infrastructure issue resolution. Built a Go service that categorizes incoming Jira tickets, compares cluster goal-state vs. current state, and auto-applies playbook fixes — eliminating 38% of manual ops work. Stack: Go, RabbitMQ, MongoDB, Jira APIs, Kubernetes, Prometheus, Grafana, Splunk.
  • Gateway service migration. Removed RabbitMQ from the alerting hot path by building a single gateway backed by MongoDB Change Streams and the Temporal framework. Idempotent child workflows eliminated race conditions and cut alert processing time.
  • Proactive cloud outage detection. Engineered a multi-source correlation system with a sliding-window algorithm (1-hour lookback) that produces confidence scores from Prometheus alerts and cloud-provider health dashboards, then auto-creates tracking incidents.

Atlas operations

  • Resolve Atlas infrastructure incidents surfaced by Go-planner automations across AWS, GCP, and Azure.
  • Coordinate with cloud-provider TAMs for RCA on integration failures.
  • Perform operational actions (resync, cluster maintenance) and prevent customer impact through proactive alerting.

Senior DevOps Engineer

NoBroker.com · led team of 5
Aug 2021 – Dec 2023 · 2 yrs 5 mo

Led a team of 5 DevOps engineers driving platform initiatives. Owned PCI DSS certification for NBPay and led the ClickHouse migration that saved ₹10L+ annually.

Leadership

  • Led a team of 5 DevOps engineers over 2 years, driving platform initiatives, conducting code reviews, and mentoring on Kubernetes, CI/CD, and cloud best practices.
  • Established team processes for incident response, on-call rotations, and infrastructure-as-code reviews.

Security & compliance

  • PCI DSS certification lead for the NBPay payments system — server hardening, cloud security controls, network segmentation, audit logging.
  • Standardized vulnerability scanning, access controls, and encryption at rest / in transit.

Technical wins

  • ClickHouse migration. Replaced Elasticsearch with ClickHouse — ₹10L+ annual savings, 10:1 compression with LZ4HC + TTL expressions, 5K+ logs/sec throughput. Published on Medium.
  • Awarded Star Performer for cost & performance impact.
  • Promoted to Senior DevOps Engineer in recognition of technical leadership.

DevOps Engineer

NoBroker.com
Jun 2020 – Jul 2021 · 1 yr 2 mo

Implemented Linkerd service mesh across microservices, architected the initial Kubernetes infra for NoBrokerHood, and shipped Project Blackbox — staging-as-a-service for 100+ devs.

Service mesh

  • Implemented Linkerd as a sidecar proxy across microservices to solve inter-service communication between NoBroker and NoBroker Search.
  • Enabled traffic monitoring, load balancing, and rate limiting — eliminated race conditions and improved reliability.
  • Gained observability into request flows, latency, and service dependencies.

NoBrokerHood

  • Architected initial Kubernetes infrastructure for the society-management platform.
  • Built nginx automations for application scaling and performance tuning.

Project Blackbox

  • Self-service staging platform on Docker Swarm — eliminated bottlenecks for 100+ developers.
  • Jenkins multibranch pipelines in Groovy (47.5% of the codebase) with Perl automation (27.1%); Swarm + Prometheus + Loki monitoring; Git-to-HTTPS automation via Traefik + Let's Encrypt with dynamic subdomains.
  • Published on NoBroker Engineering on Medium.

DevOps Engineer

DXC Technology
Apr 2019 – Jun 2020 · 1 yr 3 mo

Multi-cloud infrastructure management across AWS, GCP, and Azure. Led the Terraform 0.11 → 0.12.1 migration — SPOT Award for delivery.

  • Multi-cloud infrastructure management (AWS, GCP, Azure).
  • Led Terraform 0.11 → 0.12.1 migration project.
  • Received SPOT Award for exceptional project delivery.

Technical Lead

Cognizant Technology Solutions
Jun 2017 – Apr 2019 · 1 yr 11 mo

Production application-server management and middleware operations. SSL automation. Client Performer Award for operational excellence.

  • Production application server management and middleware operations.
  • SSL certificate migration and automation.
  • Awarded Client Performer for operational excellence.

Senior System Engineer

Cognizant Technology Solutions
Jun 2015 – Jun 2017 · 2 yrs 1 mo

Application server management and middleware support. Production environment maintenance and troubleshooting.

04

Selected projects

01 · MongoDB

Automated infrastructure issue resolution

Self-healing Atlas · Go automation
−38% manual work · 3 years of toil replaced

Go service with a node-based execution graph using a modular executor pattern. Fetches Jira tickets, compares Atlas cluster goal-state vs. current, applies playbook fixes, or escalates to customers.

STACK: Golang · RabbitMQ · MongoDB · Jira API · Kubernetes · Prometheus · Grafana · Splunk
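
To illustrate the goal-state comparison described in this project, here is a minimal Go sketch. It is not the production service: `ClusterState`, `Drift`, `DiffStates`, `Decide`, and the setting keys are all hypothetical names chosen for the example, and a cluster is assumed to be reducible to a flat key/value map.

```go
package main

import (
	"fmt"
	"sort"
)

// ClusterState is a hypothetical flattened view of a cluster's settings.
type ClusterState map[string]string

// Drift describes one setting whose current value differs from the goal.
type Drift struct {
	Key, Want, Got string
}

// DiffStates returns every key in goal whose value in current is missing
// or different, sorted by key so the result is deterministic.
func DiffStates(goal, current ClusterState) []Drift {
	var drifts []Drift
	for k, want := range goal {
		if got, ok := current[k]; !ok || got != want {
			drifts = append(drifts, Drift{Key: k, Want: want, Got: got})
		}
	}
	sort.Slice(drifts, func(i, j int) bool { return drifts[i].Key < drifts[j].Key })
	return drifts
}

// Decide maps drift to an action. playbooks marks the keys for which an
// automated fix is assumed to exist; anything else is escalated.
func Decide(drifts []Drift, playbooks map[string]bool) string {
	if len(drifts) == 0 {
		return "close"
	}
	for _, d := range drifts {
		if !playbooks[d.Key] {
			return "escalate"
		}
	}
	return "apply-playbook"
}

func main() {
	goal := ClusterState{"version": "7.0", "nodes": "3"}
	current := ClusterState{"version": "7.0", "nodes": "2"}
	drifts := DiffStates(goal, current)
	fmt.Println(drifts, Decide(drifts, map[string]bool{"nodes": true}))
}
```

Keeping the diff separate from the decision means each step can be tested on its own, which is one plausible reason to model such a service as a graph of small executors.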
02 · MongoDB

Gateway service migration

Architecture simplification · event-driven
Race conditions eliminated · Faster alert processing

Removed RabbitMQ dependency by building a single gateway service. MongoDB Change Streams for event-driven processing, Temporal framework for idempotent child workflows.

STACK: Golang · MongoDB Change Streams · Temporal · TTL
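
The idea behind idempotent processing can be sketched with the standard library alone. This is an illustration under stated assumptions, not the gateway's code: the `Alert` shape, `IdempotencyKey`, and `Processor` are hypothetical, and the in-memory set merely stands in for whatever deduplication the real workflows rely on.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// Alert is a hypothetical event shape; the real schema is not described here.
type Alert struct {
	ClusterID string
	Kind      string
	FiredAt   int64 // Unix seconds
}

// IdempotencyKey derives the same ID for the same alert, however many
// times that alert is delivered.
func IdempotencyKey(a Alert) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s|%s|%d", a.ClusterID, a.Kind, a.FiredAt)))
	return hex.EncodeToString(sum[:8])
}

// Processor runs a handler at most once per idempotency key.
type Processor struct {
	mu   sync.Mutex
	seen map[string]bool
}

func NewProcessor() *Processor {
	return &Processor{seen: make(map[string]bool)}
}

// Handle reports whether the handler ran. The key is claimed under the
// lock before the handler runs, so concurrent duplicates cannot both run.
// A failed handler keeps its claim in this sketch; real code would need
// a retry policy.
func (p *Processor) Handle(a Alert, handler func(Alert)) bool {
	key := IdempotencyKey(a)
	p.mu.Lock()
	if p.seen[key] {
		p.mu.Unlock()
		return false
	}
	p.seen[key] = true
	p.mu.Unlock()
	handler(a)
	return true
}

func main() {
	p := NewProcessor()
	a := Alert{ClusterID: "c1", Kind: "disk-full", FiredAt: 1700000000}
	count := 0
	for i := 0; i < 3; i++ {
		p.Handle(a, func(Alert) { count++ })
	}
	fmt.Println("handler ran", count, "time(s)")
}
```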
03 · MongoDB

Proactive cloud outage detection

Correlation engine · confidence scoring
Auto-incident creation · 1-hour sliding window

Multi-source correlation engine using a sliding-window algorithm with a 1-hour lookback. Correlates cluster signals with cloud-provider health dashboards, produces confidence scores, auto-creates tracking incidents.

STACK: Temporal · MongoDB · Prometheus · Sliding Window
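
A sliding-window correlation of this kind could look roughly like the Go sketch below. The scoring rule (fraction of expected sources that reported a region inside the window), the 0.75 threshold, and all names such as `Signal`, `Window`, and `Confidence` are assumptions made for illustration; the actual scoring model is not described here.

```go
package main

import "fmt"

// Signal is one observation from a source, for example an alerting
// system or a cloud-provider status page. Field names are hypothetical.
type Signal struct {
	Source string
	Region string
	At     int64 // Unix seconds
}

// Window keeps only signals that are at most LookbackSec old.
type Window struct {
	LookbackSec int64
	signals     []Signal
}

// Add records a signal, then drops anything older than the lookback.
func (w *Window) Add(s Signal, now int64) {
	w.signals = append(w.signals, s)
	w.evict(now)
}

// evict filters the slice in place, keeping signals at or after the cutoff.
func (w *Window) evict(now int64) {
	cutoff := now - w.LookbackSec
	kept := w.signals[:0]
	for _, s := range w.signals {
		if s.At >= cutoff {
			kept = append(kept, s)
		}
	}
	w.signals = kept
}

// Confidence returns the fraction (0 to 1) of the expected sources that
// reported the given region inside the window.
func (w *Window) Confidence(region string, sources []string, now int64) float64 {
	w.evict(now)
	if len(sources) == 0 {
		return 0
	}
	seen := map[string]bool{}
	for _, s := range w.signals {
		if s.Region == region {
			seen[s.Source] = true
		}
	}
	hits := 0
	for _, src := range sources {
		if seen[src] {
			hits++
		}
	}
	return float64(hits) / float64(len(sources))
}

func main() {
	now := int64(1700000000)
	w := &Window{LookbackSec: 3600} // 1-hour lookback
	w.Add(Signal{Source: "prometheus", Region: "us-east-1", At: now - 5400}, now) // too old, evicted
	w.Add(Signal{Source: "prometheus", Region: "us-east-1", At: now - 600}, now)
	w.Add(Signal{Source: "provider-status", Region: "us-east-1", At: now - 120}, now)
	score := w.Confidence("us-east-1", []string{"prometheus", "provider-status"}, now)
	if score >= 0.75 { // hypothetical threshold
		fmt.Printf("open tracking incident (confidence %.2f)\n", score)
	}
}
```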
04 · NoBroker

ClickHouse logging infrastructure

Cost & performance overhaul · zero-downtime migration
₹10L+ annual savings · 10:1 compression · 5K+ logs/sec

Led Elasticsearch → ClickHouse migration. Researched ClickBench benchmarks, studied Zerodha/Cloudflare deployments, executed zero-downtime cutover. Published an article reaching 1,000+ engineers.

STACK: ClickHouse · Fluent Bit · Redash · Grafana · LZ4HC
05 · NoBroker

Project Blackbox

Staging-as-a-service · developer platform
100+ developers enabled · <5 min Git→HTTPS

Complete self-service staging platform. Git → Jenkins → Nexus → Portainer → Swarm → Traefik with Let's Encrypt SSL and dynamic subdomains.

STACK: Docker Swarm · Jenkins/Groovy · Perl · Traefik · Portainer · Prometheus · Loki
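
Dynamic subdomains need a DNS-safe name for each branch. The sketch below shows one way to derive it; it is written in Go for consistency with the other examples, whereas the original platform used Groovy and Perl. `Subdomain`, `Host`, the fallback label, and the `staging.example.com` base domain are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Subdomain turns a Git branch name into a DNS-safe label: lowercase
// letters, digits, and single hyphens, at most 63 characters, with no
// leading or trailing hyphen.
func Subdomain(branch string) string {
	var b strings.Builder
	lastHyphen := true // true at the start so leading separators are dropped
	for _, r := range strings.ToLower(branch) {
		switch {
		case (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9'):
			b.WriteRune(r)
			lastHyphen = false
		case !lastHyphen:
			// Collapse any run of other characters into one hyphen.
			b.WriteByte('-')
			lastHyphen = true
		}
	}
	label := strings.TrimRight(b.String(), "-")
	if len(label) > 63 {
		label = strings.TrimRight(label[:63], "-")
	}
	if label == "" {
		label = "branch" // hypothetical fallback for names with no usable characters
	}
	return label
}

// Host builds the full hostname under a base domain.
func Host(branch, baseDomain string) string {
	return Subdomain(branch) + "." + baseDomain
}

func main() {
	fmt.Println(Host("feature/JIRA-123_new login", "staging.example.com"))
}
```

A reverse proxy such as Traefik could then route each derived hostname to the matching stack; that wiring is outside the scope of this sketch.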
06 · NoBroker

NoBrokerHood infrastructure

Kubernetes architecture · platform scaling
Scaled from launch

Architected initial Kubernetes infrastructure for the society-management application. Built automation scripts, nginx configurations, and scalable environments for platform expansion.

STACK: Kubernetes · Nginx · Automation

05

Skills


Programming & Automation · 5 areas
Golang (advanced) · Python (advanced) · Bash/Shell (expert) · Groovy/Jenkins DSL (advanced) · Perl (advanced)

Containers & Orchestration · 4 areas
Kubernetes (CKA 91%) · Docker & Swarm (expert) · Linkerd Service Mesh (advanced) · Temporal (advanced)

Cloud Platforms · 3 clouds
AWS (certified) · Google Cloud (certified) · Azure (advanced) · Multi-cloud architecture (expert)

CI/CD & Infra · 4 tools
Jenkins (7+ years) · Terraform (advanced) · GitOps (expert) · Spinnaker (advanced)

Observability & Data · 6 tools
Prometheus & Grafana (expert) · ClickHouse (expert) · Splunk (advanced) · Fluent Bit (expert) · Loki (advanced) · Elasticsearch (7+ years)

Security & Compliance · 6 areas
PCI DSS (lead) · Server Hardening (advanced) · Network Segmentation (advanced) · Cloud Security Controls (advanced) · Vulnerability Mgmt (advanced) · Audit Logging (advanced)

06

Credentials

Certified Kubernetes Administrator

The Linux Foundation · 91% · Top 9th pct

Certified Kubernetes Application Developer

The Linux Foundation

Associate Cloud Engineer

Google Cloud

Solutions Architect Associate

Amazon Web Services

CloudEndure Migration

Amazon Web Services

WebSphere

IBM

B.Tech, Electrical & Electronics Engineering

Pondicherry Engineering College · 2011 — 2015

Capstone: Modelling and analysis of high-boost DC–DC converter — published in IEEE.

07

Publications

Real-Time Log Analysis and Cost-Efficient Log Storage

LinkedIn / Medium · Dec 2023 · 1,000+ readers

Technical deep-dive on migrating from Elasticsearch to ClickHouse — architecture decisions, compression strategy, performance benchmarks, and the path to ₹10L+ savings.

Project Blackbox — Our Mysterious Staging Environment

NoBroker Engineering on Medium · Sep 2020

Architecture guide for self-service developer environments built on Docker Swarm, Jenkins, and Traefik. Covers automation workflows and scalability patterns.

08

Awards

  • 2023 · Star Performer for ClickHouse migration & cost optimization · NoBroker
  • 2021 · Promotion to Senior DevOps Engineer for technical leadership · NoBroker
  • 2020 · SPOT Award for exceptional Terraform migration delivery · DXC Technology
  • 2018 · Client Performer for operational excellence · Cognizant

09

Beyond work

Sport

Badminton

Active player — recreational and competitive. Strategy, quick decisions, teamwork. The good kind of tired.

Side build

Android development

Mobile app exploration — a different end of the SDLC, which keeps my empathy for the developers I support sharp.

Game

FIFA on Xbox

Pattern recognition, adaptation, and recovering from a losing scoreline — strangely transferable skills.

Home lab

Pi-hole + Unbound DNS

Network-wide ad-blocking on a Raspberry Pi. Recursive DNS resolver, custom blocklists, monitoring — no third-party DNS providers.

Open source

Traefik community

Active contributor — helping users in the community forum debug issues I've already encountered.

Side project

Wedding invitation site

Custom-built and hosted on GitHub Pages. Schedule, gallery, venue map — the most personally meaningful deploy on my résumé.

10

Let's talk

Open to senior platform engineering, cloud infrastructure leadership, and SRE / DevOps team-lead roles. Remote, hybrid, or on-site.

LinkedIn · GitHub
Location: Bengaluru, India