Henrique Gibi de Pádua


Fazenda Grande, Jundiaí/SP – Brazil

+55 (11) 958-086-328 – henriquedepadua@yahoo.com.br

Site Reliability Engineer | AWS | Observability | Python

Skills

  • AWS
  • SRE
  • Observability
  • Monitoring
  • Python
  • SQL/NoSQL
  • IaC
  • Troubleshooting
  • Incident response
  • RCA

Personal Statement

Site Reliability Engineer with over six years of experience operating and improving production systems in cloud-native environments. Strong background in observability, incident response, and reliability engineering, applying structured operational practices to reduce instability and improve service resilience on AWS-based platforms.

Experienced in working within high-criticality and regulated environments, combining technical depth with operational discipline. Skilled in transforming metrics, logs, and incident history into actionable reliability improvements, helping teams evolve from reactive support models toward more proactive, automation-driven practices.

Recognized for strong analytical thinking, stakeholder communication, and accountability in production systems. Passionate about building reliable, scalable, and observable systems while contributing to the continuous maturation of engineering and reliability practices.

Certifications

AWS Solutions Architect - Associate

Comprehensive understanding of AWS services and technologies. Demonstrated the ability to build secure and robust solutions using architectural design principles based on customer requirements. Able to strategically design well-architected distributed systems that are scalable, resilient, efficient, and fault-tolerant.

AWS Cloud Practitioner

Fundamental understanding of IT services and their uses in the AWS Cloud. Demonstrated cloud fluency and foundational AWS knowledge. Able to identify essential AWS services necessary to set up AWS-focused projects.

IOX Hub Tech - SRE Observability Metrics

Certified in "Observability Metrics". Demonstrated theoretical knowledge of observability metrics for applications and infrastructure. Able to apply this knowledge autonomously, understanding concepts, application architecture, and correlations between metrics, APM, and logs.

Oracle Cloud Infrastructure Foundations - Associate

Demonstrated fundamental knowledge of public cloud services provided by Oracle Cloud Infrastructure.

Kaspersky Certified Endpoint Security for Business

Fundamentals of administering Kaspersky Endpoint Security for Windows. Demonstrated ability to set up the product and troubleshoot most issues. Able to support business customers.

Kaspersky Certified Home and Small Business products

Fundamental understanding of Kaspersky home and small business products and their setup. Demonstrated ability to set up the products and troubleshoot most issues. Able to support end customers.

Languages

Education

Video Presentation

This is a short video presentation in which I briefly introduce:

  • Who I am
  • Certifications
  • Core competencies
  • How I work
  • Key strengths
  • Contact details

Of course, this is just a short version.

If you want to know more, feel free to ask me at any time.

Personal Project

Rogue-mage Adventure

A game under development that collects telemetry and generates user-experience metrics. The project is in progress and can be followed here.

It uses a data pipeline to transform telemetry to fit a predetermined pattern. The architecture also uses AWS Lambda, AWS Secrets Manager, SQS, RDS, and API Gateway.

Recent Professional Experiences


ThoughtWorks

1 year
Jun/2025 - Jan/2026

Consultant Cloud Engineer

  • Site Reliability Engineering in a Production Environment
      • Acted as SRE for a production-grade application, focusing on stability, performance, and operational resilience.
      • Participated in on-call rotations, triaging incidents and coordinating recovery actions within defined SLA targets.
      • Contributed to strengthening operational maturity through structured incident handling and proactive reliability initiatives.
  • Observability Ownership and Reliability Insights
      • Owned end-to-end observability strategy for the application, covering logs, dashboards, alerts, and service health indicators.
      • Designed and maintained Splunk searches and dashboards to monitor error patterns, latency behavior, and availability signals.
      • Improved alert quality by reducing noise and increasing signal relevance, minimizing operational fatigue.
      • Transformed operational data into actionable reliability improvements and engineering backlog priorities.
  • Incident Management and Operational Governance
      • Structured incident documentation and response workflows using Jira and Confluence.
      • Standardized runbooks and post-incident documentation to improve knowledge reuse and onboarding efficiency.
      • Promoted consistency in incident reporting, including timestamps, impact assessment, and corrective actions.
      • Helped evolve team practices toward a more reliability-driven and data-informed operational culture.
  • Reliability Metrics and Continuous Improvement
      • Analyzed error recurrence patterns and operational metrics to identify systemic reliability gaps.
      • Classified incidents by category and business impact, enabling prioritization based on frequency and risk exposure.
      • Measured trend improvements before and after corrective actions to validate operational effectiveness.

Concentrix

1 year
May/2024 - May/2025

Senior Software Support

  • Professional Responsibilities and Expertise in Customer Service
  • Multichannel Contact and Record Keeping: Experienced in receiving and managing user contacts by phone, email, and web, ensuring comprehensive documentation in the tools specific to each operation:
      • Conducting the initial user interaction, understanding requests, and directing them to the appropriate solution specialist as per defined processes.
      • Proactively contacting users to gather additional information when needed for efficient issue resolution.
      • Ensuring adherence to the high-quality standards set by the department and management.
      • Recording incidents in management systems with the detail and formatting defined in operational processes.
      • Consulting second-level support for guidance on processes and technical issues not covered in current documentation.
      • Leveraging operational support tools to provide solutions for reported cases, ensuring positive user experiences.
      • Directing cases to the appropriate support levels as established in the processes, ensuring SLA (Service Level Agreement) compliance.
      • Continuously monitoring operational results and user satisfaction, suggesting improvements for process enhancement.
      • Staying connected to the support systems and tools necessary to ensure comprehensive, quality service.
  • Work Hours Monitoring: Accurately tracked work hours through software integrated with telephone systems, ensuring precise time management.
  • Results and Continuous Improvement: Consistently aligned my performance with user satisfaction goals and process improvements, actively contributing suggestions and innovations that added value to the service provided.

Itaú Unibanco

1 year, 8 months
Jan/2022 – Oct/2023

Software Engineer

  • Software Engineering in a Highly Regulated Financial Environment
      • Developed and maintained backend systems in a large-scale banking ecosystem with strict compliance, security, and audit requirements.
      • Worked in mission-critical environments where availability, resilience, and data integrity were non-negotiable.
      • Ensured adherence to enterprise standards for reliability, traceability, and operational governance.
  • Cloud-Native Architecture on AWS
      • Designed and implemented scalable cloud solutions using AWS services such as SQS, Fargate, and Lambda.
      • Built distributed systems capable of handling asynchronous workloads and high transaction volumes.
      • Applied fault-tolerant architectural patterns to improve system resilience and scalability.
  • Observability and Production Monitoring
      • Implemented monitoring strategies using Grafana and Prometheus to track service health and performance metrics.
      • Used Splunk for log analysis, anomaly detection, and operational investigations in production incidents.
      • Collaborated with cross-functional teams to reduce incident recurrence and improve production stability.
  • Containerization and Event-Driven Systems
      • Used Docker for containerized application deployment, ensuring consistency across environments.
      • Integrated Kafka for real-time event processing in high-throughput scenarios.

Other Courses and Training


Application of Software Engineering in Emerging Systems

Evolution and improvement of software engineering processes. Development and management of projects with DevOps. Software engineering for mobile applications, web applications (WebApps), and game development.


Java with Spring Boot

Core Java development life cycle, and Java with Spring as the framework for web servers and web apps.


AWS Solutions Architect Associate

Comprehensive understanding of AWS services and technologies: strategically design well-architected distributed systems that are scalable, resilient, efficient, and fault-tolerant. Click here to verify.


AWS DevOps - Dynamic Website

Hosted a dynamic website using AWS serverless services such as a CDN, S3, and Lambda. Click here to verify.


TDD in Java

Test-Driven Development: write the test first, then the code itself. Also covers pair programming. Click here to verify.


Technical Writing

How to write API documentation with Swagger, with examples. Click here to verify.


Java Software Development - OOP - Spring Web

A comprehensive course on Java software development, OOP, and Spring for the web, taught in Brazilian Portuguese. Click here to verify.


Node.js Introduction

An introductory Node.js course. Built a simple API using Express. Click here to verify.


Git + GitHub

Git is a tool for managing multiple versions of source code, with edits recorded in a Git repository. GitHub serves as a place to host copies of that repository. Click here to verify.


Algorithms and Programming Logic

Complete course on algorithms and programming logic covering C, C++, C#, Java, and Python. Click here to verify.

Older Professional Experiences


GFT do Brasil

8 months
Jan/2021 – Aug/2021

AWS Builder | System Analyst

  • Serverless Architecture with AWS
      • Serverless solutions using Amazon CloudWatch Events and Scheduled Events
      • Solutions with AWS Lambda for efficient and scalable systems
      • Significant optimization of operational costs
      • High availability and quick response to real-time events
  • Application Design for Fault Tolerance
      • Focus on creating applications with maximum fault tolerance and self-healing
      • Careful design practices and rigorous testing
      • Resilience and reliability in high-demand scenarios or unexpected failures
  • Optimization and Redesign of Critical Components
      • Identification and analysis of critical components in the technologies used
      • Proposal and execution of redesigns to increase robustness and reliability
      • Contribution to increased operational efficiency
      • Reduction of downtime risks
  • Global Accessibility and Reliability
      • Applications kept accessible and trusted globally
      • Implementation of data replication strategies and load balancing services
      • Consistent, high-quality user experience regardless of location

Foundever

3 years
Jan/2018 – Jan/2021

Service Desk | Kaspersky Certified Technician

  • Kaspersky Certified Technician (available only to Kaspersky analysts)
  • End-customer technical support provider
      • Reached a service level of 96% (efficiency and customer satisfaction)
      • Tickets: 75% solved with First Contact Resolution
      • No escalation needed: 96% of tickets
  • General troubleshooting (Windows, Android, and macOS)
  • Network configuration to prevent ransomware attacks
      • Generate and maintain a backup copy as a preventive measure
      • Control access: avoid using remote access (most attacks occur this way)
  • Remote access to workstations and servers (remote offices are not a problem)
  • DNS setup and Windows registry manipulation
  • Log collection (Wireshark, ProcMon, traces)
      • Some errors may not have been detected when the software was designed
      • Some of these errors happen only in certain environments
  • Bug tracing
      • Problems can arise with each operating system update
      • With many different programs across several different branches, incompatibilities can be detected

Stefanini

1 year
Dec/2016 – Oct/2017

Helpdesk Analyst

  • Technical support (Helpdesk and Backoffice)
  • Analysis of internal system processes
  • Creation of operational and management reports
  • Application installation and removal
  • Technical support for users' daily issues
  • Keeping servers and systems working correctly
  • General troubleshooting across Latin America and in other countries when needed
  • User support for Office 365 and IBM solutions

Mühlbauer

3 years
Aug/2012 – Oct/2015

Service Engineer

  • Installation and maintenance of RFID machines worldwide
  • Daily reports on project status, technical issues, solutions, etc.
  • Multidisciplinary troubleshooting: IT, machinery, process, material, level 2 support
  • Travel to participate in international training, meetings, and courses
  • Lived abroad (Argentina) for 1 year to provide support to a customer
  • Machine operator training (English and Spanish)
  • To prepare for this, I received prior training in Germany and Malaysia, at the company's subsidiaries.
  • I was also in charge of training the customer's staff and ramping up their proficiency with the machines.

Santa Casa de Misericórdia de Belo Horizonte

1 year
Sep/2011 – Aug/2012

Technical Team Leader

  • Technical Team Leadership in a Hospital Environment
      • Led a team of 12 technical professionals responsible for maintenance and support of medical devices and infrastructure.
      • Coordinated daily activities, workload distribution, and technical prioritization to ensure service continuity.
      • Acted as the main point of accountability for team performance and service delivery quality.
  • Stakeholder and Client Expectation Management
      • Managed expectations of internal hospital stakeholders, ensuring alignment between technical constraints and operational needs.
      • Communicated incident impacts, resolution timelines, and preventive measures to clinical and administrative teams.
      • Balanced urgency, safety, and resource allocation in a high-criticality healthcare environment.
  • Vendor and Supplier Relationship Management
      • Interfaced directly with equipment suppliers and third-party service providers for maintenance contracts and issue resolution.
      • Negotiated service expectations, delivery timelines, and technical requirements.
      • Ensured compliance with maintenance standards and optimized supplier performance.
  • Process Improvement and Cost Optimization
      • Oversaw deployment of a system to manage medical device maintenance lifecycle and inventory control.
      • Reduced clinical engineering operational costs through supplier optimization and process standardization.

Medsystem Hospitalar

1 year
Jun/2010 – Jun/2011

Clinical Engineering - Internship

  • Internship at a clinical engineering company that provides services to a hospital in Sorocaba.
  • Preventive and corrective maintenance of medical equipment and of the hospital's compressed air supply.
  • Technical support across hospital sectors, for example the emergency room, the intensive care units for children and adults, the operating room, and the maternity ward.
  • Managed the stock of medical gases so that the hospital never ran out.
  • Supported video-assisted urological surgeries, accompanying the doctor during the procedure and assisting with the surgical camera, laser, surgical C-arm (live radiography), and any equipment-related complications.
  • In addition to the above, I developed a list of preventive maintenance protocols for an anesthesia cart, which made checking the items more practical and less time-consuming.

Fatec-SO

1 year
May/2007 – Jun/2008

Mechanical Design - Internship

  • Selected through an internal selection process, I was assigned to:
      • Assist students (like myself, at the time) in the Mechanical Design Lab and support the use of AutoCAD® software, answering questions about its operation.
      • Provide a range of services in the Multidisciplinary Laboratory so that students could use the computers for academic purposes, including work unrelated to mechanical design.
      • Participate in various activities, such as preparing posters and slideshows for science fairs and for entrance exam candidates.

2º GAC-AP

3 years
Mar/2003 – Mar/2006

Soldier - Brazilian Army

  • Military Training:
  • Meticulousness
  • Precision
  • Team spirit
  • Training for promotion to higher ranks
  • Community activities

Contact

📞 +55 11 958-086-328

📧 henriquedepadua@yahoo.com.br

LinkedIn GitHub Twitter Reddit

Jundiaí/SP — Feb 18th, 2026.

I remain at your disposal for any further clarification.