sathishkumar NR
About Candidate
Linux, AWS, Azure, DevOps, Ansible, Kubernetes, Terraform, Rundeck, Chef, Cookbook, CI/CD, ServiceNow, Splunk, SRE Practices, Generative AI Mastermind (completed), Change, Project, and Transition Management, Client/Vendor Management, Patch Management, People Management, Project Delivery, Data Center Migration, ISO Audit, CMDB, ITSM, ITIL, Problem, Incident, and Release Management, Infrastructure IT Service Manager, support and operations
● Managed and maintained over 1,458 Linux servers across multiple clients, ensuring high availability and optimal performance.
● Automated server configuration and deployment processes using Ansible and Terraform, significantly decreasing deployment times.
● Designed and implemented comprehensive monitoring and alerting systems with Splunk and CloudWatch, minimizing downtime and ensuring rapid issue resolution.
● Managed a large-scale IT environment supporting over 4,000 servers for 300+ clients, delivering stable Unix/Linux operations and compliance.
● Led Kubernetes cluster deployment and management across on-premise and AWS EKS environments, ensuring high availability of critical applications.
● Developed multiple Terraform modules to streamline application and service automation, improving infrastructure consistency and efficiency.
● Implemented disaster recovery strategies, including DR testing, to ensure business continuity for critical systems.
● Automated routine operational tasks such as password management and employee offboarding using Automic Automation, enhancing security and efficiency.
● Established and maintained HA solutions for CRM and PCS clusters, ensuring high resilience and minimal downtime.
● Managed patching, change, incident, and problem management processes, driving continuous service improvement and compliance.
● Led infrastructure automation initiatives integrating Nagios, shell scripts, and custom tools to proactively address system issues.
● Delivered strategic service delivery management, including vendor management, service reporting, and quarterly review meetings.
● Ensured security and compliance in Azure cloud environments through internal audits and partner collaborations.
● Managed CMDB, release, and project transition processes, supporting seamless service delivery and operational excellence.
● Implemented automation across all infrastructure by creating tenant access.
● Monitored server status and fixed any issues found.
● Created UC4 IDs, enabled scripts, and set up servers to reach Automic Automation.
Infrastructure operations:
● Strategic Service Delivery Management
● Management updates and status reporting
● Monthly or quarterly service reviews
● Vendor management responsibilities
● Azure cloud partner collaboration on security and compliance
● Azure in various internal audits
● Developing and maintaining a cloud service catalogue
● DR Testing
● Release Management
● Patch Management
● Change Management
● Incident Management
● Problem Management
● Process Improvements
● CMDB (Configuration Management)
● Project Transition Management
● People Management
● Server Availability Report
● Server Capacity Report
● Problem Report (Ticket)
● Drove the team to ensure a stable Unix/Linux environment and, as an SME, actively participated in all critical and escalated issues, with compliance and vulnerability fixes delivered through automation.
● Managed 4,000+ servers for 300+ customers.
● Managed Kubernetes clusters for multiple customers, on-premise and on Amazon EKS.
● Experience in Infrastructure as Code and AWS services.
● Experience setting up build and deployment automation with Terraform and scripts using Jenkins.
● Used Ansible for patching and admin tasks.
● Deployed configuration management and provisioning to AWS using Terraform.
● Created multiple Terraform modules to manage application and service automation.
● Installed and configured log collection and monitoring using ELK and Nagios.
● Disaster recovery planning and implementation
● Experience deploying and upgrading mission-critical applications on Kubernetes clusters.
● Designed and implemented an automation tool to fix issues by integrating Nagios and shell scripts.
● Provide services, solutions, guidance, and expertise and promote the development of content, data, information, and knowledge management.
● Provide and support corporate solutions for internal and external collaboration across cloud and on-prem data centers.
● Designed and implemented HA solutions with CRM and PCS clusters.