Architecture Weekly #125 Follow Up
How Grafana used Dapr to improve vulnerability scans
If you're building containers using third-party dependencies, you have to protect yourself from supply chain attacks. It might seem as easy as running a scan against a container, only to discover later that scale makes things complicated: you need to handle several sources of scan requests, cache results for performance, and handle errors. Grafana tried to solve this using Dapr; find out what they designed.
#security
Evolving the Backend Storage for Platform Metrics
Heroku stores and analyzes metrics for its clients to enable features like auto-scaling. A whole subsystem called MetaaS, or Metrics-as-a-Service, is designed specifically for this task, and it includes Apache Kafka and Cassandra. While there were no issues with Kafka, operating a Cassandra cluster proved tedious. The team decided to migrate the data to DynamoDB and, after a year, concluded it was the right choice. Find the reasoning inside.
QA myth busting: Quality can be measured
Testing NFRs is part of an architect's job, and another part is interpreting the results correctly. Making low coupling or low cyclomatic complexity a goal in itself can hurt your product. Vitaly Sharovatov explores the nature of metrics in QA and explains the proper attitude to them in this longread.
The Importance of Using a Composite Metric to Measure Performance
To follow up on metrics, let's consider an example from Indeed: they walk through different scenarios of a client web page and demonstrate that you cannot possibly use a single metric to conclude whether the page was slow. Instead, you need a composite metric for a more holistic picture.
Discord's Streaming Technology
Discord is a popular chat app with group call functionality. Discord is also closely tied to gaming, so streaming games on a call was a natural necessity. In their engineering blog they describe how the stream is coordinated between client apps and Discord servers, how WebRTC helps make it happen, and how performance is measured.
Big thanks to Nikita, Constantin, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita, Dmytro, Denis and Mikhail for supporting the newsletter. They receive early access to the articles, influence the content and participate in the closed group where we discuss the architecture problems. Join them at Patreon or Boosty!
CNCF
How Grafana used Dapr to improve vulnerability scans
Grafana open source software empowers users to query, visualize, alert on, and explore metrics, logs, and traces, regardless of their storage location. Grafana OSS equips users with tools to transform…
Architecture Weekly #125 Highlights
Kubernetes resiliency (RTO/RPO) in Multi-Cluster deployments
The main disaster recovery objectives are the Recovery Time Objective (how much time it takes to get back up) and the Recovery Point Objective (how much data we lose during the disaster). Kubernetes with the StatefulSet feature can be considered a tool to achieve zero RTO for single-cluster resiliency. Multi-cluster is a bigger challenge. Find out more in the article by Avesha.
Fault-tolerant Distributed Transactions Can be Fast and Simple
There is a paper on a new Unanimous two-phase commit protocol (U2PC), an alternative to the two-phase commit that Spanner, MongoDB, Yugabyte and others use in conjunction with Paxos. That pair requires 2N+1 nodes to tolerate N failures (so 3 nodes for a single node failure), while U2PC claims to require only N+1. Murat digs deep into the paper in his blog post.
Making Impact as a Software Engineer
And the second piece of content from my side! I conducted a webinar on making an impact as a software engineer, and on how career growth is connected to business impact. Find the whole video here!
avesha.io
Kubernetes resiliency (RTO/RPO) in Multi-Cluster deployments
Ah Kubernetes! The panacea to all our DevOps challenges.
Interview with infamous Dylan Beattie!
YouTube
Software Craftsmanship with Dylan Beattie
Why companies hiring React developers instead of FinTech developers? How to understand if you need to search for a quick solution or actually study the problem? Dylan Beattie, an author of RockStar language, shares multiple funny stories on tech, conferences…
Architecture Weekly #115 - Follow-Up
XML Generation with Typescript. Java-style
And an article from myself! If you came to TypeScript from Java for backend development, you're probably missing the way Java serializes objects into XML. I had the same feeling and decided to bring a similar DX to TypeScript. Find out what I managed to achieve!
Observability with AWS X-Ray
If you have a single Lambda, you can be fine just looking at CloudWatch logs. However, if you have two or three, figuring out issues in production becomes tricky. One way to battle this is integration tests with Testcontainers. The other is using observability solutions like AWS X-Ray. Find an explanation of what it is, how it helps, and how to set it up for a Node.js Lambda writing to DynamoDB.
Building a Service Mesh in a Hybrid Environment
An interesting case from Quora. Historically, they ran Kubernetes workloads alongside EC2 instances, which were not planned for migration. The setup for calls between services had its limitations in scalability and operational usability, so they decided to migrate to a service mesh. Find out what happened next!
Why Kubernetes needs an LTS
Release cycles are important in operating systems, programming languages and, indeed, container orchestrators. This is a lightweight post explaining why Kubernetes could use LTS releases, since upgrading a cluster can be about as difficult as configuring a new one.
Use percentiles to analyze application performance
Some non-functional requirements can be tested with a clear pass or fail, for example whether software runs on different operating systems for portability. With performance, a single measurement won't tell you anything; that's why we need to talk about averages, medians and percentiles. Find an explanation of those terms.
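To make the difference between averages, medians and percentiles concrete, here is a minimal Python sketch. The latency numbers are made up for illustration, and the `percentile` helper uses the simple nearest-rank definition, which is only one of several definitions in use.

```python
import statistics

# Hypothetical response times in milliseconds: nine fast requests and one slow outlier.
latencies_ms = [100, 102, 98, 101, 99, 103, 97, 100, 102, 2000]


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, clamped so the rank is at least 1.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


mean_ms = statistics.mean(latencies_ms)      # pulled up by the single outlier
median_ms = statistics.median(latencies_ms)  # the "typical" request
p90 = percentile(latencies_ms, 90)
p99 = percentile(latencies_ms, 99)           # exposes the slow tail

print(f"mean={mean_ms} median={median_ms} p90={p90} p99={p99}")
```

With this sample the mean is almost three times the median, while the 99th percentile shows the slow request that neither of them describes well.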
Next week I will be on vacation, so there will be no newsletter issue. Grab a short break!
Vladimir Ivanov Dev Blog
XML Generation with Typescript. Java-style.
Generating XML from TS can be cumbersome, but not necessarily. Find out how to do that Java style.
Architecture Weekly #115 - Highlights
Formal Methods: Just Good Engineering Practice?
This week the TLA+ conference 2024 took place, and the keynote speaker, Marc Brooker, wrote a post about formal verification methods, which are essential in large distributed systems. He argues that formal methods are not only necessary, but actually make for good engineering practice. And I hope we all want to be good engineers.
#philosophy #distributedsystems
Postgres Aurora DB major version upgrade with minimal downtime
Lyft's payment database is 30 TB in size. Upgrading such a database in place would take approximately 30 minutes, which can be pretty expensive. So Lyft's team decided to use logical replication to create a replica, then stop writes to the main DB, make sure the replication lag dropped to 0, and redirect writes to the new instance. More details inside!
The serverless illusion
While serverless technologies are amazing and let you move incredibly fast, they also require a specific understanding: they free you from operational knowledge but demand a grasp of distributed systems. And that's not an easy tradeoff. Find a long read in The Architect Elevator.
brooker.co.za
Formal Methods: Just Good Engineering Practice? - Marc's Blog
Architecture Weekly #114 - Follow-Up
How do distributed databases handle secondary indexes?
Primary indexes are used for sharding; however, when you need to search for something without knowing the primary key, secondary indexes help. There are at least 4 strategies for working with secondary indexes in distributed databases, and guess what: this article walks you through them, describing the pros and cons and mentioning the databases that use each strategy.
Optimizing Postgres Column Order
Performance tactics include a lot of design decisions, like caching, using a CDN and, indeed, organizing data. A more obscure optimization related to data design comes from data alignment at the disk level. Find out how you can save up to 20% of disk storage by simply... reordering the columns!
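The alignment effect can be illustrated with a small Python model. This is a simplified sketch, not Postgres code: it assumes only fixed-width types with their usual sizes and alignments (boolean 1 byte, smallint 2, integer 4, bigint 8), ignores the tuple header and variable-length columns, and the example table layout is hypothetical.

```python
# (size in bytes, required alignment in bytes) for a few fixed-width types.
TYPE_LAYOUT = {
    "boolean": (1, 1),
    "smallint": (2, 2),
    "integer": (4, 4),
    "bigint": (8, 8),
}


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def row_data_size(column_types):
    """Bytes used by the column data of one row, including padding
    inserted so that every column starts at an aligned offset."""
    offset = 0
    for type_name in column_types:
        size, alignment = TYPE_LAYOUT[type_name]
        offset = align_up(offset, alignment) + size
    # Assume the whole row is padded to an 8-byte boundary.
    return align_up(offset, 8)


# Hypothetical table with small and large columns interleaved.
original = ["boolean", "bigint", "smallint", "bigint", "integer", "bigint"]
# Same columns, widest alignment first, so no padding is needed between them.
reordered = sorted(original, key=lambda t: TYPE_LAYOUT[t][1], reverse=True)

print(row_data_size(original), row_data_size(reordered))
```

In this toy layout the interleaved order needs 48 bytes per row and the reordered one 32. Real savings depend on the actual column mix and on the fixed per-row overhead that the model leaves out.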
Consuming a Kafka Topic is Easy, Isn't It?
Kafka is an essential part of many real-world systems, so it should be quite obvious how to reliably consume a Kafka topic. However, the subject is not so easy. Managing offsets, proper error handling and other issues require elaboration. So grab the article!
Should Terraform be applied before or after merge?
With Infrastructure-as-Code there is a question of when changes should be applied: after they are merged to the main branch or before that. This article considers the pros and cons of both strategies and recommends a mixed approach depending on the criticality of the target environment.
How browsers work
Once I was interviewing at Amazon, and they asked what happens when you type a URL into the browser. I was able to talk about DNS queries, caches, the TLS handshake, HTML loading and DOM tree building. Find an extensive page explaining all of those steps and some more!
Big thanks to Nikita, Anatoly, Oleksandr, Dima, Pavel B, Pavel, Robert, Roman, Iyri, Andrey, Lidia, Vladimir, August, Roman, Egor, Roman, Evgeniy, Nadia, Daria, Dzmitry, Mikhail, Nikita, Dmytro, Denis and Mikhail for supporting the newsletter. They receive early access to the articles, influence the content and participate in the closed group where we discuss the architecture problems. They also see my daily updates on all the things I am working on. Join them at Patreon or Boosty!
Alexdebrie
How do distributed databases handle secondary indexes? A survey
Learn how different distributed databases handle secondary indexes, and the benefits and drawbacks of each approach.
Architecture Weekly #107 - Follow Up
Securing your MongoDB Atlas installation
I am doing a review for one of the companies and discovered that their Atlas installation does not use a virtual private network, does not have a whitelist, nothing. Knowing the connection string gives access to the prod database. Please, don't do this. Study this guide to see how you can secure your DB connection.
Unorthodox intro to Kubernetes for Developers
Kubernetes is everywhere, but how can you describe it to an engineer with just a couple of years of experience? Find an extremely clear explanation of why we need it and how it generally works.
The single-tenancy to multi-tenancy spectrum
Have you ever hit a limit on the number of control plane requests in AWS Lambda? The folks who wrote this article did. It surfaced in the single-tenant setup of their business. Here they discuss those limitations and explain that a multi-tenant setup can be better in some cases due to a set of trade-offs, which you will also find inside.
Benchmarking PostgreSQL connection poolers: PgBouncer, PgCat and Supavisor
Connection pooling is the idea of reusing connections from an application to a database, because opening one for each request is relatively expensive. For PostgreSQL there are several connection poolers, so you might want to understand which one best suits your particular case. Here's an article that compares 3 poolers for a small, medium and large number of clients. Enjoy the graphs!
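The reuse idea itself fits in a few lines of Python. This is an in-process sketch for illustration only: PgBouncer, PgCat and Supavisor are standalone proxies with far more features, and the `ConnectionPool` class and `fake_connect` factory below are hypothetical names, not part of any of them.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Opens a fixed number of connections up front and hands them out for reuse."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # pay the connection cost once

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free (raises queue.Empty on timeout).
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it to the pool instead of closing it


# Stand-in for an expensive database connect call; records every connection opened.
opened = []


def fake_connect():
    conn = object()
    opened.append(conn)
    return conn


pool = ConnectionPool(fake_connect, size=2)
for _ in range(10):
    with pool.connection() as conn:
        pass  # run a query here

print(len(opened))
```

Ten requests are served with only two connections ever opened, which is the saving the poolers in the benchmark provide at a much larger scale.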
Tackling the challenges of using event-driven architecture in a billing system by Thoughtworks
Billing is a capability every business should have: you need to charge your customers one way or another. Thoughtworks shared an article on how they created a new billing system based on event-driven architecture. Find out what challenges they faced.
MongoDB
MongoDB Developer Data Platform With Strong Security Capabilities
Safeguard your data with strong security defaults on the MongoDB data platform. Meet stringent requirements with robust operational and security controls.
Architecture Weekly #105 - Highlights
Data Structures for Data-Intensive Applications
This paper provides an insightful guide on choosing and designing data structures for managing large amounts of data efficiently. It explains the trade-offs between speed, storage, and updates, helping developers and architects make informed decisions for their systems. Through clear concepts like the RUM framework, it highlights how understanding data usage and hardware can lead to better performance in applications ranging from databases to operating systems.
#db #performance
Fundamentals of Availability for System Design Interview
One of the requirements you need to fulfill during an interview is availability. There are several things you need to understand: what it means for the business, how we measure availability, and how to design for it. Check out the fundamentals here, and remember, if you need help preparing for such an interview, you can always request a mock system design here.
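As a quick illustration of how availability is usually measured, here is a small Python sketch that turns an availability percentage into a downtime budget and combines the availability of components that all have to be up. The function names and target values are made up for the example, and the serial formula assumes independent failures.

```python
def allowed_downtime_minutes(availability_percent, period_days=365):
    """Downtime budget, in minutes, for a given availability target."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - availability_percent / 100)


def serial_availability(*availabilities):
    """Availability of a request path that needs every component to be up.
    Inputs and result are fractions, e.g. 0.999 for "three nines"."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes per year")

# Two "three nines" services in a chain are less available than either one alone.
print(f"{serial_availability(0.999, 0.999):.6f}")
```

Three nines leaves roughly 8.8 hours of downtime per year, and every extra dependency in the request path eats into that budget.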
connect() - why are you so slow?
While this post by Cloudflare explains the mechanics of choosing a source port when establishing a TCP connection, what amazes me most is how deep network mechanics can affect overall system performance much more than design decisions do.
Architecture Weekly #98 - Follow-Up
AI Coding Assistants
This week was rich in interviews, so grab an exclusive interview with Anton Arhipov, Developer Advocate at JetBrains, as we delve into the groundbreaking realm of AI Coding Assistants. In this enlightening conversation, Anton sheds light on how these intelligent tools are reshaping the landscape of software development. From boosting productivity to revolutionizing code quality, discover how JetBrains is at the forefront of this technological evolution. Whether you're a seasoned developer or just intrigued by the future of AI in coding, this interview offers valuable insights into the next wave of development tools that are set to transform the industry.
Recursive embedding and clustering
Understanding users is crucial to every organization that considers itself data-driven, and Spotify is no exception. However, the task itself is challenging even for seasoned data scientists. In this article, Spotify engineers explain how they arrived at manageable, understandable data that they can reason about. Mentions of recent improvements in data science itself are included!
#dataengineering
Database Fundamentals
Let's continue with the database topic! Find another post that summarizes Database Internals and Designing Data-Intensive Applications together in a long read covering indexes, isolation levels, LSM trees and much more!
How Pinterest scaled to 11 million users with only 6 engineers
My favorite topic of Frugal Architecture! Pinterest shows how they evolved their system to support the growth of the user base. Simple technologies, vertical and horizontal scaling, and database sharding, all supported by only 6 engineers at the level of 11 million users, plus some other tactics in the blog post!
Sliding window rate limits in Distributed Systems
Grab, like Uber, sends marketing communications to its users, of whom there are around 270 million. But they have to carefully balance the marketing load so that users won't churn because of it. An interesting task to solve at scale. Grab leveraged Redis, Bloom Filters and Roaring Bitmaps to handle it! Details inside.
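For readers new to the technique, here is a minimal single-process sketch of a sliding window limit in Python. It is not Grab's design: their solution is distributed and built on Redis, Bloom Filters and Roaring Bitmaps, while this hypothetical `SlidingWindowLimiter` just keeps recent timestamps per user in memory.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allows at most max_events per user within any window_seconds-long window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # user_id -> timestamps of allowed events

    def allow(self, user_id, now):
        timestamps = self._events[user_id]
        # Forget events that have slid out of the window (now - window, now].
        while timestamps and timestamps[0] <= now - self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_events:
            return False  # sending now would exceed the limit
        timestamps.append(now)
        return True


# At most 2 marketing messages per user per 60 seconds; time is passed in explicitly.
limiter = SlidingWindowLimiter(max_events=2, window_seconds=60)
results = [
    limiter.allow("user-1", now=0),   # allowed
    limiter.allow("user-1", now=10),  # allowed
    limiter.allow("user-1", now=20),  # rejected: two messages already in the window
    limiter.allow("user-1", now=61),  # allowed: the message from t=0 has expired
    limiter.allow("user-2", now=20),  # allowed: limits are tracked per user
]
print(results)
```

Passing `now` explicitly keeps the sketch deterministic; a real service would read a clock and would need shared storage so that every instance sees the same counters.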
Update on the CAP theorem
You all know the Consistency, Availability, Partition Tolerance triad. You might also note that it gets its fair share of criticism. Find a Twitter note on why the CAP theorem should actually get an improvement.
YouTube
Unveiling the Future of Code: Anton Arhipov Discusses AI Coding Assistants
Join us on Architecture Weekly Channel for an exclusive interview with Anton Arhipov, Developer Advocate at JetBrains, as we delve into the groundbreaking realm of AI Coding Assistants. In this enlightening conversation, Anton sheds light on how these intelligent…