Commit 8dde3b2

deployment styles, hadoop ecosystem, azure cloud practices urls, google sre, best kubernetes tools, data engineering & data science vocab
1 parent 376f963 commit 8dde3b2

4 files changed

+88
-5
lines changed

README.md

Lines changed: 88 additions & 5 deletions
@@ -59,6 +59,17 @@ This mindmap created by `https://app.mindmapmaker.org/`
5959
- [Azure Cloud Adoption Framework :CAF](https://learn.microsoft.com/en-gb/azure/cloud-adoption-framework/): organization-wide adoption guidance
6060
- [Azure Well-architected Framework :WAF](https://learn.microsoft.com/en-us/azure/well-architected/): workload-focussed design and continuous improvement guidance
6161
- [Azure Architecture Center :AAC](https://learn.microsoft.com/en-us/azure/well-architected/service-guides/?product=popular): architecture patterns and reference architectures
62+
- [Best practices in cloud applications](https://learn.microsoft.com/en-us/azure/architecture/best-practices/index-best-practices)
63+
- [Cloud Design Patterns](https://learn.microsoft.com/en-us/azure/architecture/patterns/)
64+
- [Landing zone](https://learn.microsoft.com/en-us/azure/architecture/landing-zones/azure-virtual-desktop/design-guide?tabs=baseline)
65+
- Abstractly speaking, a landing zone helps you plan and design an Azure deployment by conceptualizing a designated area for the placement and integration of resources. There are two types of landing zones:
66+
1. `platform landing zone`: provides centralized enterprise-scale foundational services for workloads and applications.
67+
2. `application landing zone`: provides services specific to an application or workload.
68+
- [Google SRE Handbook](https://sre.google/sre-book/monitoring-distributed-systems/#xref_monitoring_golden-signals)
69+
- `Latency` is the response time of your application, usually expressed in milliseconds
70+
- `Throughput` is how many transactions per second or minute your application can handle
71+
- `Errors` is usually measured as a percentage of requests that fail
72+
- `Saturation` is how fully your application uses the available resources, such as CPU and memory
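
The four signals above can be computed from raw request records. The following is a minimal sketch, not taken from the SRE book: the `Request` record, the `golden_signals` function, the choice of the 95th percentile for latency, and CPU as the only saturation resource are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float  # time taken to serve the request
    ok: bool           # False if the request failed


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = (pct * len(sorted_values) + 99) // 100  # ceil(pct * n / 100)
    return sorted_values[max(rank, 1) - 1]


def golden_signals(requests, window_seconds, cpu_used, cpu_total):
    """Summarise the four signals for one observation window (hypothetical helper)."""
    latencies = sorted(r.latency_ms for r in requests)
    failed = sum(1 for r in requests if not r.ok)
    return {
        "latency_p95_ms": percentile(latencies, 95),
        "throughput_rps": len(requests) / window_seconds,
        "error_rate_pct": 100 * failed / len(requests),
        "saturation_pct": 100 * cpu_used / cpu_total,
    }
```

A real service would usually export these values through a metrics library rather than compute them in application code; the sketch only makes the definitions concrete.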
6273

6374
---
6475

@@ -81,6 +92,10 @@ This mindmap created by `https://app.mindmapmaker.org/`
8192

8293
- [Substack Leaderboard](https://substack.com/browse/technology): Newsletter
8394

95+
---
96+
97+
- [Best Kubernetes Tools](https://bluelight.co/blog/best-kubernetes-tools): Bluelight Consulting
98+
8499
## Engineering blog
85100

86101
- [AWS Architecture Blog](https://aws.amazon.com/blogs/architecture/)
@@ -225,12 +240,38 @@ This mindmap created by `https://app.mindmapmaker.org/`
225240
</details>
226241

227242
- API Gateway vs Load Balancer
228-
- **API Gateway**: Manages access to backend services, handles tasks like rate-limiting, authentication, logging, and security policies.
229-
- **Load Balancer**: Distributes network traffic across multiple servers for high availability and even load distribution.
243+
- API Gateway: Manages access to backend services, handles tasks like rate-limiting, authentication, logging, and security policies.
244+
- Load Balancer: Distributes network traffic across multiple servers for high availability and even load distribution.
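
To make the distinction above concrete, here is a small sketch in which the gateway applies policy (authentication and a rate limit) and the load balancer only picks a server. The class names, the round-robin strategy, and the fixed per-key request budget are illustrative assumptions, not a description of any particular product.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Load balancer: spreads requests evenly across backend servers."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        return next(self._servers)


class ApiGateway:
    """Gateway: enforces access policy before a request reaches a backend."""

    def __init__(self, balancer, api_keys, max_requests_per_key):
        self._balancer = balancer
        self._api_keys = set(api_keys)
        self._limit = max_requests_per_key
        self._counts = {}  # requests seen so far per API key

    def handle(self, api_key, path):
        # Authentication: reject unknown callers.
        if api_key not in self._api_keys:
            return 401, "unauthorized"
        # Rate limiting: a fixed budget per key (no time window, for brevity).
        used = self._counts.get(api_key, 0)
        if used >= self._limit:
            return 429, "rate limit exceeded"
        self._counts[api_key] = used + 1
        # Only now is the request forwarded to a server chosen by the balancer.
        return 200, f"{self._balancer.pick()} served {path}"
```

For example, `ApiGateway(RoundRobinBalancer(["a", "b"]), {"k"}, 3).handle("k", "/x")` returns `(200, "a served /x")`, while an unknown key is rejected before any server is chosen.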
245+
246+
- Data engineering & Data Scientists Vocab 101 [ref](https://x.com/SeattleDataGuy/status/1753950189314810358?s=20)
247+
248+
<details>
230249

231-
- Data engineering Vocab 101 [ref](https://x.com/SeattleDataGuy/status/1753950189314810358?s=20)
250+
<summary>Expand</summary>
251+
🔹 Data engineering Vocab 101
232252

233-
<img src="files/data-engineering-101.jpg" alt="Data engineering 101" width="400"/>
253+
[ref](https://x.com/SeattleDataGuy/status/1753950189314810358?s=20)
254+
255+
<img src="files/data-engineering-101.jpg" alt="Data engineering 101" width="400"/>
256+
257+
🔹 75 Key Terms That Data Scientists Remember by Heart
258+
259+
[ref](https://www.blog.dailydoseofds.com/p/75-key-terms-that-data-scientists)
260+
261+
<img src="files/de01.png" alt="Data engineering 01" width="400"/>
262+
263+
🔹 A Comprehensive NumPy Cheat Sheet Of 40 Most Used Methods
264+
265+
[ref](https://www.blog.dailydoseofds.com/p/a-comprehensive-numpy-cheat-sheet)
266+
267+
<img src="files/de02.png" alt="Data engineering 02" width="400"/>
268+
269+
🔹 15 Pandas ↔ Polars ↔ SQL ↔ PySpark Translations
270+
271+
[ref](https://www.blog.dailydoseofds.com/p/15-pandas-polars-sql-pyspark-translations)
272+
273+
<img src="files/de03.png" alt="Data engineering 03" width="400"/>
274+
</details>
234275

235276
- DevOps, Platform engineering and SRE (site reliability engineering) [ref](https://www.splunk.com/en_us/blog/learn/sre-vs-devops-vs-platform-engineering.html)
236277

@@ -305,7 +346,7 @@ This mindmap created by `https://app.mindmapmaker.org/`
305346

306347
<summary>SSO workflow, Types of SSO, SSO Implementations</summary>
307348

308-
🔹SSO workflow: Identoty Provider (IdP), Service Provider (SP), SSO Server
349+
🔹SSO workflow: Identity Provider (IdP), Service Provider (SP), SSO Server
309350
- IdP: Central Authentication server e.g., Google
310351
- SP: Individual applications that rely on SSO, e.g., Trello
311352
- SSO Server: Bridge between IdP and SPs
@@ -324,5 +365,47 @@ This mindmap created by `https://app.mindmapmaker.org/`
324365

325366
🔹SSO Implementations: Microsoft Entra ID (FKA Azure Active Directory), Okta, Ping Identity, OneLogin, Auth0
326367

368+
</details>
369+
370+
- Deployment Styles: Blue/Green, Canary, and A/B
371+
372+
<details>
327373

374+
<summary>Blue/Green, Canary, A/B</summary>
375+
376+
🔹Blue/Green Deployment: Two identical environments, "Blue" and "Green". Deploy the new version to the inactive environment, test it, then switch users over to it. For example, AWS supports blue/green deployments through services including Elastic Beanstalk, OpsWorks, CloudFormation, CodeDeploy, and Amazon ECS.
377+
378+
🔹Canary Deployment: Roll out new version to a small group of users, monitor feedback, then do a full-scale release.
379+
380+
🔹A/B Testing: Compare two versions of a webpage or app to see which performs better. A typical example of A/B testing is website usability testing.
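
One common way to pick the "small group of users" in a canary deployment is to hash each user ID into a stable bucket, so the same user always sees the same version while the rollout percentage grows. This is a minimal sketch under that assumption; the `bucket` and `route` functions are hypothetical names, and real rollouts are usually handled by the deployment platform or a service mesh.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def route(user_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of users to the new version."""
    return "canary" if bucket(user_id) < canary_percent else "stable"
```

Setting `canary_percent` to 0 sends every user to the stable version and 100 completes the release. A blue/green switch is the all-or-nothing form of the same decision.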
381+
382+
</details>
383+
384+
- Flaky Test: A test that sometimes passes and sometimes fails with no changes to the code. Common causes include poorly written tests, async waits, test-order dependency, and concurrency issues. Flaky tests slow down CI/CD pipelines and can mask real failures that then reach end users. [ref](https://github.com/jmicco/JaSST_tutorial)
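
A simple way to surface flakiness is to rerun a test many times against unchanged code and check whether the outcomes differ. The sketch below assumes this rerun approach; `is_flaky` and both sample tests are hypothetical, and the seeded random draw merely stands in for an async wait that sometimes exceeds its timeout.

```python
import random

# Seeded so the demonstration is reproducible; real flakiness is not.
_rng = random.Random(0)


def is_flaky(test_fn, runs=20):
    """Rerun a test against unchanged code; mixed outcomes mean it is flaky."""
    outcomes = set()
    for _ in range(runs):
        try:
            test_fn()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
    return len(outcomes) > 1


def stable_test():
    assert sorted([3, 1, 2]) == [1, 2, 3]


def timing_dependent_test():
    # Stand-in for an async wait that sometimes exceeds a 50 ms timeout.
    simulated_wait_ms = _rng.uniform(0, 100)
    assert simulated_wait_ms < 50
```

Here `is_flaky(stable_test)` is `False` because every run passes, while `timing_dependent_test` produces both passes and failures across reruns.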
385+
386+
- Hadoop Ecosystem
387+
<details>
388+
<summary>Hadoop vs Azure, AWS, GCP</summary>
389+
390+
🔹1. **HDFS (File Storage)**: Azure Data Lake Storage, Amazon S3, Google Cloud Storage
391+
392+
🔹2. **YARN (Resource Management)**: No direct equivalent in Azure, AWS, GCP
393+
394+
🔹3. **MapReduce (Data Processing)**: HDInsight, Amazon EMR, Google Cloud Dataproc
395+
396+
🔹4. **Spark (Fast Data Processing)**: Databricks, Spark in HDInsight, Azure Synapse Analytics, Amazon EMR, Google Cloud Dataproc
397+
398+
🔹5. **PIG, HIVE (Query Data)**: HDInsight, Azure Synapse Analytics, Amazon EMR, Google Cloud Dataproc
399+
400+
🔹6. **HBase (NoSQL DB)**: Azure Cosmos DB, HBase on a virtual machine (VM), HBase in Azure HDInsight, Amazon DynamoDB, Google Cloud Bigtable
401+
402+
🔹7. **Mahout, Spark MLLib (ML Libraries)**: Databricks, Amazon SageMaker, No direct equivalent in GCP
403+
404+
🔹8. **Solr, Lucene (Search/Index)**: Azure Cognitive Search, Amazon CloudSearch, Google Cloud Search
405+
406+
🔹9. **Zookeeper (Cluster Management)**: No direct equivalent in Azure, Amazon Managed Apache ZooKeeper, No direct equivalent in GCP
407+
408+
🔹10. **Oozie (Job Scheduling)**: Azure Data Factory, AWS Step Functions, Google Cloud Composer
328409
</details>
410+
411+

files/de01.png (993 KB)

files/de02.png (1.02 MB)

files/de03.png (1.33 MB)
