Platform.sh is a remote-first global workforce building a better cloud platform to create, manage and responsibly scale web applications.
As a collective with diverse backgrounds, we work together to test, innovate, and challenge one another, finding new ways to reimagine digital experiences. We’re here to help our customers thrive.
Bring your experience to our team, and help us build a better way.
To reinforce our technical prowess, we are looking to grow our operations team. If you’re looking for an exciting, high-growth opportunity with an award-winning, cutting-edge company, this could be just the job for you.
We're looking for an Operations and Service Reliability Engineer with a taste for Python and Go, great Linux system understanding, and a real hunger for the challenges of building robust, distributed systems.
What you can expect
Platform.sh is a PaaS shrouded in a lot of black magic (we can consistently clone a whole running cluster, with its state, databases, and indexes, in a matter of seconds). We want to get this down to hundreds of milliseconds. Interested? There is more...
Our external API is pure Hypermedia REST + OAuth on top of Pyramid. It mechanizes the Git layer and needs more features.
From the same manifest, we can consistently generate a Docker container, an LXC container, or VM disk images (AWS, Azure, OpenStack). We want more targets.
We probably have the highest container density in the industry. We need to get it higher.
We support any Python, Ruby, Node.js, PHP, Java, and .NET. Time to roll out Elixir, of course (and Rust. We need Rust).
Reporting directly to the Director of Operations and Engineering, and working in close collaboration with Customer Support and the rest of the Engineering teams, you will be responsible for:
- Cloud operations: configuring clusters and systems, deploying changes and container images, provisioning capacity, and helping the Support team debug production issues.
- Automation: working with the rest of the Engineering teams to automate all processes, and improving, securing, and updating existing automation.
- Reliability: maintaining core infrastructure services as code, and working on monitoring systems and capacity planning.
- Quality: being part of the on-call schedule to receive real-time alerts, writing or updating documentation and runbooks for alerts created by services, and responding to incidents.
What you bring
- Proven successful experience in an operations role
- The ability to successfully manage cloud-based infrastructure for a fast-growing organization
- Experience with containerization technologies
- Exposure to cloud services such as AWS, Azure, GCP, etc.
- An understanding of how an OS works, of networking, of how Git works, and of the constraints of a distributed system
- Puppet experience
- Proficiency in Python (Golang a plus)
Nice to have
- Knowledge of Magento Ecommerce, Symfony, Drupal, eZ Platform, or TYPO3
- Ability to cover weekends
Note: we don't like stress, so we build everything to be robust and resilient, but stuff does break. This is a role with on-call duties and fire drills. If this fills you with dread... well, this might not be a fit for you.
A typical month in our team would look like this
- Development week: writing the code and automation that make our infrastructure run smoothly, in Puppet, Go, and Python. The work ranges from monitoring tasks up to self-healing and updating
- Deploy week: everything that goes live on PSH is deployed by us, and the project managers assign those cluster updates to whoever is working that week (during off hours). We're always improving :)
- Escalation week: whenever there's a tough problem that Support can't deal with, our team investigates why and helps solve it
- On-call week: whenever a person is on call, we don't assign them anything else, so that teammate has time to learn something new while being available in case something happens
This is a remote job. Work from anywhere in Canada!
We’re a worldwide, distributed team looking for the best talent. Our remote model has been in practice and thriving since 2014. To us, remote work means flexibility and having truly diverse, global teams. A clear and concise written communication style is required for success in the role and the company. The cover letter to your application will be the first test of this metric.
To maximize team collaboration, we prefer candidates based in Western Canada for this role.
Company benefits and perks
- An innovative product you can believe in. We’re sustainably changing the way companies develop and manage their web applications
- Voted a Best Place to Work by 96% of our employees, named one of Forbes' Top 30 Companies for Remote Jobs, and listed among the Best Workplaces for Women in France
- Hands-on leadership that cares, in a flexible, open work environment where your voice is encouraged. We can always find ways to do better and look forward to hearing your ideas
- A global team, rich with culture and diversity
- Company-wide DE&I initiative that you can be a part of
- Annual international company and team meetups (when we're not experiencing a pandemic)
- Wellness stipend and professional development budget
- Office equipment budget
- Fair PTO (standards based on location)
- Inclusive parental leave (timeline based on location)
- Healthcare, dental, and vision (US, CA, UK, and FR employees only)
- Tandem – a pool of linguists from around the world willing to help each other work on learning new languages
- Additional compensation for on-call ops and support employees
- Company shares (discretionary)
- Unlimited Platform.sh accounts
How we hire
We know that a great hire won’t meet every requirement that we’ve outlined. If you can see yourself elevating the team, we want to hear your story. Few of us would be here had we not taken a chance.
You can expect 3 – 4 interviews on Google Meet. You will have the opportunity to meet with a variety of Platformers throughout the interview process. You’ll also have the opportunity to schedule virtual coffee chats with potential future peers to see if you can envision working together. Use interview and coffee time to make sure the company aligns with what you’re looking for in your future working environment.
Expect a higher number of interviews for director-level roles and above. All roles require background checks.
About our software
Platform.sh is a unified, secure, enterprise-grade platform to build, run, and scale fleets of websites and applications. We are trusted by 5500+ organizations globally to help create innovative digital experiences.