Infrastructure Engineer - Remote / Telecommute

  • Cynet Systems
  • Cary, North Carolina
  • Full Time


Primary Monitoring and Incident Response:
  • Monitor systems using Azure Monitor, Splunk, Dynatrace, and custom dashboards.
  • Respond to alerts and triage P1/P2 escalations via ServiceNow war rooms, performing initial diagnosis and remediation where possible.
  • Adhere to incident, change, and exception management processes.
Capacity and Availability Management:
  • Identify scaling opportunities for virtual machines or services as required, and identify zone redundancy patterns for performance.
  • Keep track of capacity forecasts and proactively identify performance bottlenecks.
Backup and Restore Operations:
  • Execute frequent backups (Azure Backup, NetApp Snapshots) and perform basic restore tasks to ensure business continuity.
  • Conduct routine backup verifications/tests to confirm data integrity.
Access and Permissions Management:
  • Maintain Azure/NetApp file shares, setting up and adjusting access controls and AD group permissions according to organizational policy.
  • Perform periodic identity and access reviews to enforce the principle of least privilege.
Logging and Metrics Oversight:
  • Oversee monitoring agents (e.g., Splunk, Dynatrace, Azure Alerts, System Pulse), ensuring they are up to date and generating the right alerts/metrics for L2 to act upon.
  • Collaborate with L3 to fine-tune alert thresholds and logging when chronic issues emerge.
Basic Performance Testing:
  • Execute routine performance checks (e.g., load or stress tests) in coordination with L3 teams when potential service degradation is suspected.
  • Document and escalate consistent performance anomalies.
Skill Set:
  • Comfortable reading and troubleshooting logs/metrics (Splunk, Dynatrace, Azure Monitor).
  • Familiar with Azure Backup services, basic restore procedures, and file share permissions.
  • Proficient in ticketing systems (ServiceNow) and in collaborating with other technical teams on escalations.
  • Sufficient knowledge to follow runbooks and standard operating procedures (SOPs).
  • Able to keep documentation of standard operating procedures and IaC changes continuously updated in a central repository (e.g., Git repos).
  • Familiarity with Epic implementations (on-prem / cloud).
Job ID: 478124320
Originally Posted on: 5/22/2025
