Site Reliability Engineer (SRE)

Company: Procurement Sciences
Location: Washington
Posted on: February 1, 2025

Job Description:

Company Overview: Procurement Sciences AI is at the vanguard of generative artificial intelligence, transforming the government contracting sector as a Series A rocketship, proudly backed by Battery Ventures, a top 1% global technology venture capital firm. As a venture-backed B2B SaaS entity, we are dedicated to revolutionizing federal, state, and local business approaches to government contracting with disruptive AI capabilities. Our team is committed to addressing customer pain points through an AI-first strategy, ensuring our solutions are effective and ahead of the curve. Our flagship platform, celebrated for its "Win More Bids" value proposition, enhances revenue streams for our clients while driving unparalleled operational efficiencies. By harnessing the power of generative AI, tailored for the government contracting domain, we offer a unique competitive advantage. Our collaboration with Battery Ventures provides the resources and support to rapidly scale our innovations, redefining success standards and promising a quantum leap in value generation and operational excellence for our clients.

Job Description: We are looking for an experienced, tenacious, and driven Site Reliability Engineer (SRE) to join our team. The ideal candidate will be responsible for ensuring the reliability, performance, and scalability of our systems. This role will focus on performing root cause analysis, designing and implementing automated testing, monitoring key service level indicators (SLIs), and ensuring adherence to service level agreements (SLAs) and service level objectives (SLOs). The successful candidate will have a strong background in Kubernetes, Helm, observability platforms, and cloud providers such as Azure.

Key Responsibilities:

Perform root cause analysis to identify and resolve system or application issues in a timely and effective manner, often in communication with developers.
Design and implement a broad range of automated tests to ensure system reliability and performance.
Build scalable and cost-effective observability patterns in Datadog or other monitoring providers.
Monitor and analyze SLIs to ensure adherence to SLAs and SLOs.
Collaborate with development and operations teams to improve system reliability and developer experience (DevEx).
Develop and maintain monitoring and alerting systems to proactively address issues.
Implement best practices for incident management and disaster recovery.
Respond to and manage incidents, performing post-mortem analyses to prevent recurrence.
Plan and implement capacity upgrades, ensuring scalability and performance.
Automate repetitive operational tasks and develop tools for system automation.
Define, monitor, and manage SLAs, ensuring service levels meet or exceed expectations.
Ensure systems comply with security and regulatory requirements.
Identify areas for continuous improvement in systems and processes.
Create and maintain documentation for systems, processes, and incident responses.

Technical Requirements:

Proficient in Kubernetes, Helm, and troubleshooting in secure environments with limited or no remote access.
Expertise in observability and monitoring tools such as Prometheus, Grafana, ELK Stack, or Datadog.
Experience with cloud providers, particularly Azure and Azure Gov.
Strong understanding of microservices architecture, including Postgres and AI systems.
Expertise in automated testing frameworks and tools (e.g., integration tests, synthetic tests, load tests).
Experience with monitoring and analytics tools to track SLIs, SLAs, and SLOs.
Excellent problem-solving skills and attention to detail. Tenacious attitude.
Strong communication skills, with the ability to work effectively in a collaborative environment.
Proficiency in programming languages such as TypeScript and Python.
Strong scripting skills in Bash, PowerShell, or similar languages.
Experience with Infrastructure as Code (IaC) tools like Azure Bicep, AWS CDK, or Terraform.
Understanding of networking principles and experience with network troubleshooting.
Strong communication and collaboration skills, with the ability to work effectively with both technical and non-technical personnel.

Preferred Qualifications:

GovCon experience and/or security clearance.
Knowledge of GitOps principles. Familiarity with FluxCD.
Experience designing and building CI/CD pipelines.
Experience with other cloud providers (AWS, AWS GovCloud) is a plus.
Familiarity with security and compliance standards in cloud environments.
Prior experience in a similar role within a fast-paced, dynamic environment.
Deep experience operationalizing new development workloads across different teams.

Location: Washington, D.C./Northern Virginia
