Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: site-reliability-engineering

traas-stack/chaosmeta

A chaos engineering platform for supporting the complete fault drill lifecycle.

Language: Go - Size: 31.2 MB - Last synced: about 4 hours ago - Pushed: about 5 hours ago - Stars: 299 - Forks: 51

chaos-mesh/chaos-mesh

A Chaos Engineering Platform for Kubernetes.

Language: Go - Size: 64.9 MB - Last synced: about 11 hours ago - Pushed: about 21 hours ago - Stars: 6,438 - Forks: 799

dastergon/awesome-sre

A curated list of Site Reliability and Production Engineering resources.

Size: 1.17 MB - Last synced: 2 days ago - Pushed: 6 months ago - Stars: 11,563 - Forks: 1,543

SquadcastHub/awesome-sre-tools

A curated list of Site Reliability and Production Engineering Tools

Size: 159 KB - Last synced: 2 days ago - Pushed: 10 days ago - Stars: 1,130 - Forks: 159

dastergon/awesome-chaos-engineering

A curated list of Chaos Engineering resources.

Size: 244 KB - Last synced: 2 days ago - Pushed: 5 months ago - Stars: 5,823 - Forks: 639

alexei-led/pumba

Chaos testing, network emulation, and stress testing tool for containers

Language: Go - Size: 14.7 MB - Last synced: 3 days ago - Pushed: 7 months ago - Stars: 2,707 - Forks: 193

devopness/devopness

Devopness - Painless essential DevOps for everyone

Language: TypeScript - Size: 3.97 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 73 - Forks: 39

ari-hacks/kubernetes-chaos-sandbox

🧪 Tutorials for running chaos experiments with litmus chaos, chaos mesh, and gremlin (includes k8s setup)

Language: Ruby - Size: 112 KB - Last synced: about 4 hours ago - Pushed: over 3 years ago - Stars: 12 - Forks: 0

jaegertracing/jaeger-ui

Web UI for Jaeger

Language: JavaScript - Size: 14.5 MB - Last synced: 19 days ago - Pushed: 20 days ago - Stars: 1,056 - Forks: 465

litmuschaos/litmus

Litmus helps SREs and developers practice chaos engineering in a cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q

Language: Go - Size: 126 MB - Last synced: 20 days ago - Pushed: 21 days ago - Stars: 4,192 - Forks: 649

dastergon/postmortem-templates

A collection of postmortem templates

Size: 32.2 KB - Last synced: 14 days ago - Pushed: 10 months ago - Stars: 1,230 - Forks: 414

luismendes070/googlesre Fork of google/googlesre

Size: 12.7 MB - Last synced: 20 days ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

SRE-MUC/sremuc

Site Reliability Engineering Munich Meetup Page

Size: 110 MB - Last synced: 25 days ago - Pushed: 25 days ago - Stars: 7 - Forks: 6

krootee/awesome-scalability-toolbox

My opinionated list of products and tools used for high-scalability projects

Size: 735 KB - Last synced: 3 days ago - Pushed: 9 months ago - Stars: 66 - Forks: 17

ShaylenReddy42/Seelans-Tyres

This project is a proof-of-concept - which is a rewrite of my old college project - to demonstrate my skills as a DevOps Engineer before anything else after earning the Microsoft Certified: DevOps Engineer Expert certification

Language: C# - Size: 14.1 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 6 - Forks: 0

mister0/How-to-prepare-for-google-interview-SWE-SRE

This repository includes resources that are more than sufficient to prepare for a Google interview if you are applying for a software engineer or site reliability engineer position

Size: 344 KB - Last synced: about 1 month ago - Pushed: almost 2 years ago - Stars: 633 - Forks: 171

jkpl/sre-env

Dev environment for SRE

Language: Shell - Size: 4.88 KB - Last synced: about 1 month ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

developer-friendly/blog

Technical blogs on topics of Kubernetes, GitOps, CI/CD and SRE in general. Created with ❤️ using Markdown format.

Language: HTML - Size: 1.77 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 2 - Forks: 1

chaosblade-io/chaosblade

An easy-to-use and powerful chaos engineering experiment toolkit. (An easy-to-use, powerful chaos experiment injection tool open-sourced by Alibaba.)

Language: Go - Size: 4.56 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 5,753 - Forks: 920

flanksource/sre-learning-resources

A curated list of resources designed to level up as an SRE engineer

Size: 45.9 KB - Last synced: 2 days ago - Pushed: over 3 years ago - Stars: 23 - Forks: 6

dastergon/common-disaster-recovery-scenarios

A list of common Disaster Recovery (DR) scenarios for software companies

Size: 2.93 KB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 32 - Forks: 6

dastergon/sreworkbook-templates-md

A collection of templates ported from the SRE Workbook

Size: 4.88 KB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 36 - Forks: 16

zeroc0d3lab/awesome-sre Fork of dastergon/awesome-sre

A curated list of awesome Site Reliability and Production Engineering resources.

Size: 528 KB - Last synced: 4 days ago - Pushed: about 1 year ago - Stars: 92 - Forks: 4

woojiahao/interviews

A collection of my resources for studying for SWE/SRE interviews!

Size: 3.99 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 7 - Forks: 0

jnbdz/site-reliability-engineer-quickstarts

Site Reliability Engineer Quickstarts! 🦾

Size: 7.93 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 0 - Forks: 0

upgundecha/howtheysre

A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)

Language: JavaScript - Size: 449 KB - Last synced: 3 months ago - Pushed: 5 months ago - Stars: 8,826 - Forks: 744

chris-short/DevOps-README.md

What to Read to Learn More About DevOps

Size: 7.23 MB - Last synced: 19 days ago - Pushed: over 1 year ago - Stars: 450 - Forks: 27

dastergon/wheel-of-misfortune

A role-playing game for incident management training

Language: HTML - Size: 184 KB - Last synced: about 1 month ago - Pushed: 3 months ago - Stars: 158 - Forks: 41

equinixmetal-helm/k8s-otel-collector

A repo holding the Kubernetes deployment manifests for otel-collector

Size: 196 KB - Last synced: 15 days ago - Pushed: 15 days ago - Stars: 1 - Forks: 2

league3236/begindevops

A space for uploading the material I have studied to build expertise while working as a DevOps engineer / SRE. It is personal study, but I hope it can serve as a useful reference.

Language: Go - Size: 118 MB - Last synced: 4 months ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0

manuelcoppotelli/manuelcoppotelli.github.io

I'm a Professional Mistake Avoider, a.k.a. Strategic Advisor.

Language: Svelte - Size: 20.8 MB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

exajobs/sre-collection

An ongoing & curated collection of awesome SRE software and tools, libraries and frameworks, engineering books and blogs, philosophical principles, technical guidelines, and practical tools in the field of Site Reliability Engineering (SRE)

Size: 1.31 MB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 20 - Forks: 4

zeroc0d3lab/awesome-scalability Fork of binhnguyennus/awesome-scalability

🔖 Daily-updated reading list for designing High Scalability 🍒, High Availability 🔥, High Stability 🗻 back-end systems - Pull requests are greatly welcome 👬 I hope you will find this project helpful 🍀 Please help me share it to more and more people ❤️ Thank you - 谢谢 - धन्यवाद - ধন্যবাদ - Спасибо - شكرا - Merci - Gracias - Danke - Cảm ơn! 🙇

Size: 1.32 MB - Last synced: 2 days ago - Pushed: almost 2 years ago - Stars: 14 - Forks: 5

komlog-io/komlogd

The agent of Komlog, a PaaS for helping observability teams to better understand their systems.

Language: Python - Size: 680 KB - Last synced: 5 days ago - Pushed: over 6 years ago - Stars: 5 - Forks: 2

dastergon/CardsAgainstReliability Fork of CardsAgainstCryptography/CAC

A party card game for engineers caring about reliability. Based on Cards Against Humanity.

Language: TeX - Size: 7.77 MB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 95 - Forks: 3

mikeroyal/OpenShift-Guide

OpenShift Guide. Learn about the Red Hat OpenShift Container Platform, Data Science, Code Ready Containers, Podman, Buildah, and Kubernetes.

Language: Python - Size: 247 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 130 - Forks: 28

jacob-hudson/ideal-enigma

SRE Sandbox

Language: Go - Size: 8.79 KB - Last synced: 5 months ago - Pushed: almost 7 years ago - Stars: 0 - Forks: 0

AnthonyByansi/SRE-Mastery-Guide

A comprehensive open-source guide for mastering Site Reliability Engineering (SRE) with a focus on Azure, Azure DevOps, and DevOps best practices, from foundational principles to advanced expertise.

Language: Shell - Size: 66.4 KB - Last synced: 4 months ago - Pushed: 8 months ago - Stars: 2 - Forks: 0

Mregojos/Roadmap-Data-ML-AI-Cloud-DevOps-SRE

Roadmap (Data/ML/AI/Cloud/DevOps)

Size: 117 KB - Last synced: 4 months ago - Pushed: 4 months ago - Stars: 0 - Forks: 0

dastergon/availability-calculator

Calculate how much downtime should be permitted in your Service Level Agreement or Objective

Language: HTML - Size: 69.3 KB - Last synced: about 1 month ago - Pushed: over 3 years ago - Stars: 63 - Forks: 12

dastergon/error-budget-calculator

Calculate the tolerable downtime of your service

Language: HTML - Size: 14.6 KB - Last synced: about 1 month ago - Pushed: almost 6 years ago - Stars: 5 - Forks: 5

rishiloyola/SRE-Interviews

Curated list of good SRE interview questions.

Size: 25.4 KB - Last synced: 7 months ago - Pushed: almost 2 years ago - Stars: 342 - Forks: 87

operate-first/operations

The sig-operations repository.

Size: 106 KB - Last synced: 3 months ago - Pushed: over 2 years ago - Stars: 21 - Forks: 24

chiaen/sre-book-in-audio

Google Site Reliability Engineering book converted to audio

Size: 166 MB - Last synced: 7 months ago - Pushed: about 7 years ago - Stars: 160 - Forks: 53

LinuxBozo/resume

Resume of M. Adam Kendall, Engineering Leader

Size: 1.11 MB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 1 - Forks: 1

figwasp/figwasp

Keep Kubernetes Deployments up-to-date with the `latest` container images

Language: Go - Size: 226 KB - Last synced: 4 months ago - Pushed: about 2 years ago - Stars: 4 - Forks: 0

marceloboeira/sre

📚 Index for my study topics

Language: Makefile - Size: 11.6 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 57 - Forks: 8

abrunner94/maia

Maia is a CLI that allows you to execute remote commands on multiple machines at once.

Language: Go - Size: 23.4 KB - Last synced: 9 months ago - Pushed: almost 5 years ago - Stars: 1 - Forks: 0

Codehunter-py/90DaysOfDevOps Fork of MichaelCade/90DaysOfDevOps

I embarked on this journey on April 1st, 2022. My goal is to both learn and actively contribute to this project. With my existing DevOps background and knowledge, I aim to grasp the intricacies of concepts in this field as part of the #90DaysOfDevOps initiative.

Language: Shell - Size: 288 MB - Last synced: 7 months ago - Pushed: 7 months ago - Stars: 0 - Forks: 1

at15/sre-handbook

A combined introduction to operating systems and computer networks

Size: 23.4 KB - Last synced: 10 months ago - Pushed: over 7 years ago - Stars: 11 - Forks: 0

bparli/goavail

Endpoint monitoring and DNS failover agent written in Go

Language: Go - Size: 17.8 MB - Last synced: 4 months ago - Pushed: over 6 years ago - Stars: 6 - Forks: 0

angelopoerio/oom-notifier

Notify about OOMed processes, reporting the full command line

Language: Rust - Size: 61.5 KB - Last synced: 10 months ago - Pushed: over 2 years ago - Stars: 16 - Forks: 1

arivictor/cronk

Cronk - A simple cron format helper

Language: Go - Size: 2.1 MB - Last synced: 9 months ago - Pushed: 10 months ago - Stars: 0 - Forks: 0

shantoroy/site-reliability-engineering-101

This GitHub repository contains a comprehensive tutorial on Site Reliability Engineering (SRE), covering topics such as SLAs, SLOs, SLIs, Chaos Engineering, monitoring, alerting, and much more. It also includes bonus content on SRE best practices. Follow along with the #100daysofSRE challenge and improve your reliability engineering skills.

Size: 17.6 KB - Last synced: 9 months ago - Pushed: about 1 year ago - Stars: 2 - Forks: 0

danrl/skinny

The Skinny Distributed Lock Service

Language: Go - Size: 699 KB - Last synced: 11 months ago - Pushed: almost 4 years ago - Stars: 86 - Forks: 14

Stephen-RA-King/Stephen-RA-King

An experienced Python Developer

Size: 942 KB - Last synced: 5 months ago - Pushed: 5 months ago - Stars: 0 - Forks: 0

mauricioabreu/sre-hands-on

Language: Go - Size: 68.4 KB - Last synced: about 1 month ago - Pushed: 11 months ago - Stars: 3 - Forks: 0

dmyerscough/miruoncall

Oncall Dashboard

Language: Python - Size: 10.1 MB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 2 - Forks: 1

guerzon/sreafterhours

Writeups on Kubernetes, GKE, Google Cloud Platform, and site reliability engineering.

Language: HCL - Size: 739 KB - Last synced: 12 months ago - Pushed: 12 months ago - Stars: 0 - Forks: 0

gremlin/sre-tools

A collection of SRE tools

Size: 4.88 KB - Last synced: 29 days ago - Pushed: over 4 years ago - Stars: 60 - Forks: 7

exajobs/devops-collection

Welcome To The World of DevOps. An ongoing & curated collection of awesome software, libraries, learning tutorials, tools and resources and cool stuff about DevOps.

Language: Python - Size: 1.83 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 57 - Forks: 17

QAInsights/Performance-Engineers-DevOps

This repository helps performance testers and engineers who want to dive into the DevOps and SRE world.

Size: 1.77 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 41 - Forks: 23

MrSaints/terraform-provider-cabot 📦

[INACTIVE] Terraform provider for Arachnys' Cabot. Create, manage, and manipulate status checks, and alerts for services.

Language: Go - Size: 16.6 KB - Last synced: about 1 month ago - Pushed: over 6 years ago - Stars: 5 - Forks: 4

brootware/sitemon

An agent to monitor the uptime of multiple sites and response time without any external libraries.

Language: Python - Size: 70.3 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

last9/site-reliability-tools

Map of Tools for Software Observability, Reliability & Monitoring

Size: 2.93 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

sd-tang/SRE-StarterPack

Inspired by Google's Site Reliability Engineering practice, I try to build a business case for this to be introduced into my current workplace.

Size: 56.6 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 2 - Forks: 0

angelmaroco/litmus-chaos-engineering-workshop

Chaos engineering workshop on Kubernetes with Litmus

Language: Shell - Size: 4.42 MB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 2 - Forks: 0

monitoring-tutorials/prometheus-focused

I am creating this repository to learn, practice, and share knowledge with the whole world. Feel free to help me with this; to contribute, create a pull request.

Size: 105 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

skyzyx/engineering-for-site-reliability

Overall map of topics to cover for my “Engineering for Site Reliability” blog series.

Size: 1.43 MB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 5 - Forks: 0

githubfoam/gremlin-travisci

gremlin chaos engineering

Size: 19.5 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

githubfoam/chaos-k8s-sandbox

chaos engineering kubernetes pipeline

Language: Shell - Size: 33.2 KB - Last synced: 12 months ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

githubfoam/chaostoolkit-githubactions

chaostoolkit githubactions chaos engineering

Size: 18.6 KB - Last synced: 12 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

githubfoam/gremlin-githubactions

gremlin githubactions chaos engineering devsecops site-reliability-engineering observability

Size: 16.6 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

GreggSchofield/terraformed-pagerduty

Repository showing how PagerDuty can be managed using Terraform, Terraform Cloud as a remote backend and GitHub actions for a CI/CD pipeline

Language: HCL - Size: 13.7 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

brunopadz/memcached-ok

Simple way to test connection to memcached

Language: Go - Size: 1.48 MB - Last synced: about 1 month ago - Pushed: about 2 years ago - Stars: 1 - Forks: 0

byn3/holberton-system_engineering-devops

Holberton's DevOps/SRE curriculum. Projects and code that focus on Bash scripting, system design, automation, web infrastructure, web servers, Linux, Vagrant, and Vim. View the READMEs inside for more descriptions of each

Language: Shell - Size: 10.7 MB - Last synced: about 1 year ago - Pushed: about 5 years ago - Stars: 0 - Forks: 5

dastergon/py-deterministic-subsetting

Deterministic Subsetting as defined in the SRE book

Language: Python - Size: 1.95 KB - Last synced: about 1 month ago - Pushed: over 2 years ago - Stars: 4 - Forks: 5

choyiny/reliable_blog

A simple blog stack to demonstrate Google's Site Reliability Engineering principles.

Language: Ruby - Size: 2.42 MB - Last synced: 9 months ago - Pushed: about 1 year ago - Stars: 5 - Forks: 2

ualali/udacity-shop Fork of udacity/nd064_capstone_starter

Capstone project of Udacity's Cloud Native Application Architecture Nanodegree

Language: Go - Size: 7.25 MB - Last synced: 11 months ago - Pushed: over 2 years ago - Stars: 0 - Forks: 1

lreimer/go-for-operations

Demo repository for "Go for Operations" workshop.

Language: Go - Size: 3.42 MB - Last synced: 4 months ago - Pushed: over 1 year ago - Stars: 3 - Forks: 5

Knighton-Dev/SharpenUp

A .Net Standard library for working with the Uptime Robot API.

Language: C# - Size: 220 KB - Last synced: 24 days ago - Pushed: over 1 year ago - Stars: 1 - Forks: 1

TheJokersThief/go-enc

External Node Classifier written in Go

Language: Go - Size: 2.96 MB - Last synced: 19 days ago - Pushed: almost 5 years ago - Stars: 0 - Forks: 0

vacovsky/poolse

Control health checks and toggle upstream node status in load balancers with ease.

Language: Go - Size: 20.1 MB - Last synced: about 1 year ago - Pushed: almost 7 years ago - Stars: 4 - Forks: 0

lukebrady/resourced

Great resources for learning Software and Site Reliability Engineering.

Size: 5.86 KB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 0 - Forks: 0