Shows the differences between two versions of the page.
| Both sides, previous revision Previous revision Next revision | Previous revision | ||
|
wiki2:global [2020/06/21 14:03] alfred [Scaling] |
wiki2:global [2021/02/18 14:37] (current) |
||
|---|---|---|---|
| Line 82: | Line 82: | ||
| </code> | </code> | ||
| ===== Python ===== | ===== Python ===== | ||
| + | |||
| + | ==== Examples ==== | ||
| + | |||
| + | * [[https://stackoverflow.com/a/24846766|Fibonacci]] | ||
| ==== Generators ==== | ==== Generators ==== | ||
| Line 91: | Line 95: | ||
| * [[wiki2:python3#asyncio_package|asyncio package]] | * [[wiki2:python3#asyncio_package|asyncio package]] | ||
| * [[wiki2:python:notes#advanced_asyncio|Advanced]] | * [[wiki2:python:notes#advanced_asyncio|Advanced]] | ||
| + | |||
| + | ===== Coding ===== | ||
| + | |||
| + | ==== SOLID ==== | ||
| + | |||
| + | |||
| + | ==== GRASP ==== | ||
| + | ===== Testing ===== | ||
| ===== System design ===== | ===== System design ===== | ||
| + | |||
| + | * [[https://github.com/donnemartin/system-design-primer|GitHub article]] - {{ :wiki2:notes:system-design-primer-master.zip |200622 status}} | ||
| + | |||
| + | We need to take the following three elements into account: | ||
| + | * A request from an external client (an HTTP request from a browser, etc.). | ||
| + | * Your code running in some container (a Django app running on mod_wsgi, a Python script listening to RabbitMQ, etc). | ||
| + | * Pieces of infrastructure (MySQL, Redis, RabbitMQ, etc). | ||
| + | |||
| ==== Scaling ==== | ==== Scaling ==== | ||
| Line 100: | Line 120: | ||
| * Horizontal: You scale by adding more machines into your pool of resources. | * Horizontal: You scale by adding more machines into your pool of resources. | ||
| + | |||
| + | === Load balancing pattern === | ||
| + | A load balancer (software or hardware) sits in front of our servers and receives each incoming request. It chooses which worker will process the request. | ||
| + | |||
| + | * HAProxy | ||
| + | * NGINX | ||
| + | * Traefik | ||
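
The selection step described above can be sketched with a round-robin rotation. This is only an illustration: `RoundRobinBalancer` and the worker names are hypothetical, and real balancers such as HAProxy, NGINX, or Traefik also handle health checks, weights, and connection state.

```python
import itertools


class RoundRobinBalancer:
    """Hypothetical helper: hands out workers in a fixed rotation."""

    def __init__(self, workers):
        if not workers:
            raise ValueError("at least one worker is required")
        # itertools.cycle repeats the worker list endlessly.
        self._cycle = itertools.cycle(list(workers))

    def pick(self):
        # Return the worker that should handle the next request.
        return next(self._cycle)


balancer = RoundRobinBalancer(["worker-1", "worker-2", "worker-3"])
assignments = [balancer.pick() for _ in range(5)]
```

Round robin is the simplest policy; least-connections or hashing on the client address are common alternatives when requests differ a lot in cost.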
| + | |||
| + | === Caching pattern === | ||
| + | |||
| + | Precalculate results, pre-generate expensive indexes, and store copies of frequently accessed data in a faster backend. | ||
| + | |||
| + | There are two types of cache: | ||
| + | * **Application caching** requires explicit integration in the application code itself. Usually it will check if a value is in the cache; if not, retrieve the value from the database; then write that value into the cache. | ||
| + | * **Database caching** relies on the database's own cache, which you enable and tune through configuration. | ||
| + | |||
| + | In-memory caches are software that stores cached content in RAM (Redis, Memcached). | ||
| + | |||
| + | Another kind of cache which comes into play for sites serving large amounts of static media is the content distribution network (CDN). They can also provide geographic distribution. | ||
| + | |||
| + | Cache invalidation is the process of removing or updating cached entries so that the cache does not become inconsistent with the updated data. | ||
| + | |||
| + | * Memcached | ||
| + | * Varnish | ||
| + | * Cassandra | ||
| + | * Redis | ||
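
The application caching flow (check the cache, fall back to the database, then write the value into the cache) and a basic invalidation step might look like the sketch below. It assumes a plain dict as a stand-in for Redis or Memcached; `CacheAside` and `load_from_db` are hypothetical names, not part of any library.

```python
class CacheAside:
    """Hypothetical cache-aside wrapper around a slow lookup function."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}  # assumption: a dict stands in for Redis/Memcached
        self.misses = 0

    def get(self, key):
        # 1. Check whether the value is already cached.
        if key in self._cache:
            return self._cache[key]
        # 2. On a miss, retrieve the value from the database.
        self.misses += 1
        value = self._load(key)
        # 3. Write the value into the cache for later reads.
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Drop the entry so the next read goes back to the database.
        self._cache.pop(key, None)


database = {"user:1": "alice"}
cache = CacheAside(lambda key: database[key])

first = cache.get("user:1")    # miss: loaded from the database
second = cache.get("user:1")   # hit: served from the cache
database["user:1"] = "alicia"  # the underlying data changes
cache.invalidate("user:1")     # without this, reads would stay stale
third = cache.get("user:1")    # miss again: reloads the new value
```

Invalidating on every write is the simplest policy; expiry times (TTL) are a common alternative when some staleness is acceptable.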
| + | |||
| + | === Scatter and gather pattern === | ||
| + | |||
| + | The dispatcher multicasts the request to all workers in the pool. Each worker computes a local result and sends it back to the dispatcher, which consolidates the results into a single response and sends it back to the client. This pattern is used by search engines such as Yahoo and Google to handle users' keyword search requests. | ||
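
A minimal sketch of the dispatcher, assuming threads from the standard library stand in for the worker pool and small in-memory lists stand in for each worker's share of the data. The names `search_shard` and `scatter_gather` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, keyword):
    # Each worker computes a local result from its own shard.
    return [doc for doc in shard if keyword in doc]


def scatter_gather(shards, keyword):
    # Scatter: send the same request to every worker in the pool.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(search_shard, shards, [keyword] * len(shards))
        # Gather: consolidate the partial results into one response.
        merged = []
        for partial in partials:
            merged.extend(partial)
    return sorted(merged)


shards = [
    ["python generators", "load balancing"],
    ["python asyncio", "message queues"],
    ["database replication"],
]
results = scatter_gather(shards, "python")
```

In a real deployment the dispatcher would also need a timeout per worker, so that one slow shard does not block the whole response.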
| + | |||
| + | === AMQP and message queues pattern === | ||
| + | |||
| + | Message queues allow your web applications to quickly publish messages to the queue and have separate consumer processes do the work outside the scope and timeline of the client request. | ||
| + | |||
| + | They allow you to create a separate machine pool for performing off-line processing rather than burdening your web application servers. | ||
| + | |||
| + | * RabbitMQ | ||
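
The publish/consume split can be illustrated with the standard library alone. This sketch assumes an in-process `queue.Queue` and a thread as stand-ins for a broker such as RabbitMQ and a separate consumer machine; it does not use AMQP, and the message contents are made up.

```python
import queue
import threading

jobs = queue.Queue()  # assumption: stands in for a broker queue
processed = []


def consumer():
    # Runs outside the request path and handles messages one by one.
    while True:
        message = jobs.get()
        if message is None:  # sentinel used here to stop the worker
            jobs.task_done()
            break
        processed.append(message.upper())  # placeholder for real work
        jobs.task_done()


worker = threading.Thread(target=consumer)
worker.start()

# The "web application" only publishes and returns immediately.
for message in ["resize image", "send email"]:
    jobs.put(message)

jobs.put(None)  # ask the consumer to shut down
jobs.join()     # wait until every message has been handled
worker.join()
```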
| + | |||
| + | === Map-reduce pattern === | ||
| ==== Distributed systems ==== | ==== Distributed systems ==== | ||
| Line 105: | Line 165: | ||
| * [[https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing|Fallacies of distributed computing]] | * [[https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing|Fallacies of distributed computing]] | ||
| ===== Databases ===== | ===== Databases ===== | ||
| + | ==== Replication ==== | ||
| + | |||
| + | Database replication is the frequent electronic copying of data from a database on one computer or server to a database on another, so that all users share the same level of information. The result is a distributed database in which users can access data relevant to their tasks without interfering with the work of others. Using replication to eliminate data ambiguity or inconsistency among users is sometimes called normalization (not to be confused with schema normalization in relational design). | ||
| + | |||
| + | Disadvantages: | ||
| + | |||
| + | * There is a potential for loss of data if the master fails before any newly written data can be replicated to other nodes. | ||
| + | * Writes are replayed to the read replicas. If there are a lot of writes, the read replicas can get bogged down with replaying writes and can't do as many reads. | ||
| + | * The more read slaves, the more you have to replicate, which leads to greater replication lag. | ||
| + | * On some systems, writing to the master can spawn multiple threads to write in parallel, whereas read replicas only support writing sequentially with a single thread. | ||
| + | * Replication adds more hardware and additional complexity. | ||
| + | |||
| + | |||
| + | === Master-slave replication === | ||
| + | |||
| + | The master serves reads and writes, replicating writes to one or more slaves, which serve only reads. Slaves can also replicate to additional slaves in a tree-like fashion. If the master goes offline, the system can continue to operate in read-only mode until a slave is promoted to a master or a new master is provisioned. | ||
| + | |||
| + | Disadvantages: | ||
| + | * Additional logic is needed to promote a slave to a master. | ||
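
A toy model of master-slave replication, showing why a read can be stale until the pending writes are replayed on the slaves. Everything here is hypothetical (`Node`, `ReplicatedStore`, the explicit `replicate()` call); real databases ship a replication log in the background.

```python
class Node:
    def __init__(self):
        self.data = {}


class ReplicatedStore:
    """Hypothetical store: one master for writes, slaves for reads."""

    def __init__(self, replica_count):
        self.master = Node()
        self.replicas = [Node() for _ in range(replica_count)]
        self._pending = []  # writes not yet replayed on the slaves
        self._next = 0

    def write(self, key, value):
        # Writes always go to the master first.
        self.master.data[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Replay the pending writes, in order, on every slave.
        for key, value in self._pending:
            for replica in self.replicas:
                replica.data[key] = value
        self._pending.clear()

    def read(self, key):
        # Reads rotate over the slaves and never touch the master.
        replica = self.replicas[self._next % len(self.replicas)]
        self._next += 1
        return replica.data.get(key)


store = ReplicatedStore(replica_count=2)
store.write("x", 1)
stale = store.read("x")  # replication lag: the slave has no value yet
store.replicate()
fresh = store.read("x")  # the write is now visible on the slaves
```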
| + | |||
| + | === Master-master replication === | ||
| + | |||
| + | Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes. | ||
| + | |||
| + | Disadvantages: | ||
| + | * You'll need some logic to determine where to write (e.g., a load balancer). | ||
| + | * Most master-master systems are either loosely consistent (violating ACID) or have increased write latency due to synchronization. | ||
| + | * Conflict resolution comes more into play as more write nodes are added and as latency increases. | ||
| + | |||
| + | |||
| + | |||
| + | |||
| + | ==== Database partitioning ==== | ||
| + | |||
| + | Partitioning of relational data usually refers to decomposing your tables either row-wise (horizontally) or column-wise (vertically). | ||
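
Both directions can be sketched on plain dictionaries. The assumptions here are a hash of the `id` column as the row-wise placement rule and a fixed set of frequently read columns for the column-wise split; `shard_for`, `partition_rows`, and `split_columns` are hypothetical helpers.

```python
import zlib


def shard_for(key, shard_count):
    # crc32 is stable across runs, unlike the built-in hash() for strings.
    return zlib.crc32(key.encode("utf-8")) % shard_count


def partition_rows(rows, shard_count):
    # Horizontal partitioning: each row lands in exactly one shard.
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[shard_for(row["id"], shard_count)].append(row)
    return shards


def split_columns(row, hot_columns):
    # Vertical partitioning: split one row into two narrower rows
    # that share the primary key "id".
    hot = {k: v for k, v in row.items() if k in hot_columns}
    cold = {k: v for k, v in row.items() if k not in hot_columns or k == "id"}
    return hot, cold


rows = [
    {"id": "u1", "name": "a", "bio": "x"},
    {"id": "u2", "name": "b", "bio": "y"},
    {"id": "u3", "name": "c", "bio": "z"},
]
shards = partition_rows(rows, shard_count=2)
hot, cold = split_columns(rows[0], hot_columns={"id", "name"})
```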
| + | |||
| + | ==== Modified Preorder Tree Traversal (MPTT) ==== | ||
| + | |||
| + | MPTT is a technique for storing hierarchical data in a database. The aim is to make retrieval operations very efficient. | ||
| + | |||
| + | The trade-off for this efficiency is that performing inserts and moving items around the tree is more involved, as there's some extra work required to keep the tree structure in a good state at all times. | ||
| + | |||
| + | * https://www.sitepoint.com/hierarchical-data-database/ | ||
| + | * http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/ | ||
| + | * https://www.ibase.ru/files/articles/programming/dbmstrees/sqltrees.html | ||
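
The numbering behind MPTT can be sketched as a preorder walk that gives every node a left and a right value. The example tree is made up, and the tree is held in a dict rather than in database rows to keep the sketch self-contained.

```python
def assign_bounds(tree, root):
    """Walk the tree in preorder and give each node (left, right) values."""
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        left = counter  # assigned on the way down
        counter += 1
        for child in tree.get(node, []):
            visit(child)
        bounds[node] = (left, counter)  # right value assigned on the way up
        counter += 1

    visit(root)
    return bounds


def descendants(bounds, node):
    # A node is a descendant when its interval lies strictly inside the
    # ancestor's interval; in SQL this becomes a single range condition.
    left, right = bounds[node]
    inside = [name for name, (l, r) in bounds.items() if left < l and r < right]
    return sorted(inside, key=lambda name: bounds[name][0])


tree = {"Food": ["Fruit", "Meat"], "Fruit": ["Red", "Yellow"], "Meat": []}
bounds = assign_bounds(tree, "Food")
fruit_children = descendants(bounds, "Fruit")
```

Retrieval needs no recursion, which is where the efficiency comes from. The cost shows up on insert: adding a node shifts the left and right values of every node that comes after it in the walk.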
| + | ===== Networking ===== | ||
| + | |||
| + | * IPC: Inter Process Communication. | ||
| + | * TCP/IP | ||
| + | |||
| + | ===== Security ===== | ||
| + | |||
| + | * CORS | ||
| + | |||
| + | |||
| + | ===== Questions to ask ===== | ||
| + | |||
| + | * {{ :wiki2:notes:reverse-interview-master.zip |Reverse interview document}} | ||
| + | |||
| + | * Why the stack you use? Not as criticism, but (explain the problems of that stack and offer alternatives). So, what led you to that architecture? | ||
| + | * How long have you been working there? Why? What do you like most about working here? | ||
| + | * Tell me about the worst day (work-wise) you've had in the last six months. | ||
| + | * How did the company react during COVID? | ||
| + | * What brought you here? What keeps you here? What keeps you up at night? | ||
| + | |||
| + | In my experience it's important to not ask "easy" questions that can be answered by a simple yes/no. | ||
| + | * Instead of: Are you friendly to remote work? Ask: Will I be working with anyone who is remote, or who works from home on a regular basis? | ||
| + | * Instead of: Is the work life balance good? Ask: How responsive are people to emails/Slack over the weekends and after 6pm? | ||
| + | * Instead of: Can I have a good career path? Ask: Did any of your senior engineers start out as junior engineers here? | ||
| + | |||
| + | Along the same lines: if a serious problem comes up over a weekend, how willing is the team to stay and work on it? | ||
| + | |||
| + | <code> | ||
| + | The Joel Test: | ||
| + | Do you use source control? | ||
| + | Can you make a build in one step? | ||
| + | Do you make daily builds? | ||
| + | Do you have a bug database? | ||
| + | Do you fix bugs before writing new code? | ||
| + | Do you have an up-to-date schedule? | ||
| + | Do you have a spec? | ||
| + | Do programmers have quiet working conditions? | ||
| + | Do you use the best tools money can buy? | ||
| + | Do you have testers? | ||
| + | Do new candidates write code during their interview? | ||
| + | Do you do hallway usability testing? | ||
| + | </code> | ||