Akamai is looking for a Media Reliability Engineer who will be responsible for the reliability and availability of Akamai services, sharing the responsibility with the product team to diagnose, mitigate and solve outages. You will face some of the most complex challenges in distributed systems at scale and help us deliver the products our customers need in order to succeed.
Your Mission as a Media Reliability Engineer
Provide support for the highly available systems that run across thousands of data centers and all major geographies.
Work with the Product, Development, and Support teams to improve Akamai Media & Storage products every day.
Help others to be awesome by sharing your knowledge with them.
Take initiative in improving our products in small or large ways.
Contribute to solution designs to address critical issues and complex problems.
Recommend viable solutions to processes, technology, and interfaces that improve the effectiveness of the team and reduce technical debt.
Solve problems and use automation and feedback to make sure they do not happen again.
Work closely with product engineers to advocate for reliable and scalable system design, with a focus on supportability, resilience, and reliability.
Perform on-call duty as part of a team responsible for the support, availability, and performance of all Akamai Media products.
Develop tool ideas and prototypes to assist Akamai Support engineers and to proactively monitor service performance and availability.
About the Team
At Akamai, Media Reliability Engineers are highly independent and self-organized individual contributors who work together as a tight team focused on building the most reliable system. Media Reliability Engineers have a mix of system and software engineering knowledge; they work closely with product engineers to create reliable and scalable system designs and sustain our exponential growth.
Required Education and Experience
Applicants must meet one of the following education and experience requirements:
5 years of relevant experience and a Bachelor’s degree or
3 years of relevant experience and a Master’s degree or
Relevant experience and a PhD or
Equivalent professional experience
5+ years of experience with cloud computing, hosting, networking, or streaming and storage
5+ years of experience with at least one of the following languages: Python, Go, C/C++, Java or Perl/Shell
Experience with web services and cloud APIs
Experience in Linux administration and troubleshooting
Experience with fundamental technologies such as TCP/IP, HTTP, and DNS
Experience deploying and maintaining components in a large-scale environment
Good communication skills
Self-motivation and a capacity to get things done
Capacity to adapt and learn quickly
Experience with version control (Perforce or Git)
Strong scripting and automation skills
Experience operating large-scale, distributed systems
Experience with public cloud platforms
Support/DevOps experience in a Linux-based environment
Experience in networking, streaming, or storage in a 24/7 environment