Interview Notes

Academic subjects

Complexity

Complexity and Big O notation

O(1) — Constant Time
Given an input of size n, the algorithm takes the same fixed number of steps (ideally one) to accomplish the task, regardless of n.
O(log n) — Logarithmic time
Given an input of size n, the remaining work is cut by a constant factor (typically halved) with each step, so the number of steps grows with log n.
O(n) — Linear Time
Given an input of size n, the number of steps required grows in direct proportion to n (roughly one step per element).
O(n²) — Quadratic Time
Given an input of size n, the number of steps it takes to accomplish the task is proportional to the square of n.
O(C^n) — Exponential Time
Given an input of size n, the number of steps it takes to accomplish the task is a constant raised to the power n (a number that becomes very large very quickly).

let n = 16;
O(1) = 1 step "(awesome!)"
O(log n) = 4 steps "(awesome!)" -- assumed base 2
O(n) = 16 steps "(pretty good!)"
O(n^2) = 256 steps "(uhh..we can work with this?)"
O(2^n) = 65,536 steps "(...)"
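The step counts above can be reproduced with a few lines of JavaScript. This is only a sketch: the `steps` object is a name made up for illustration, and base 2 is assumed for the logarithm, as in the table.

```javascript
// Step counts for n = 16, assuming a base-2 logarithm as in the table above.
const n = 16;
const steps = {
  "O(1)": 1,
  "O(log n)": Math.log2(n),
  "O(n)": n,
  "O(n^2)": n ** 2,
  "O(2^n)": 2 ** n,
};
console.log(steps);
```

Changing `n` to 32 only adds one step to the logarithmic row, while the exponential row grows to more than four billion.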

Constant example

var friends = { Mark: true, Ana: true }; // sample lookup table (made-up data)
function isFriend(name){ // similar to knowing the index in an Array
  return friends[name] === true;
}
isFriend('Mark') // returns true and only took one step
function add(num1, num2){ // I have two numbers, takes one step to return the value
  return num1 + num2;
}

Logarithmic example

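//You decrease the amount of work you have to do with each step
function thisOld(num, array, lo = 0, hi = array.length - 1){
  if (lo > hi) return false; // empty range: num is not in the array
  var midPoint = Math.floor((lo + hi) / 2);
  if (array[midPoint] === num) return true;
  // only look at the second half of the array
  if (array[midPoint] < num) return thisOld(num, array, midPoint + 1, hi);
  // only look at the first half of the array
  return thisOld(num, array, lo, midPoint - 1);
}
var sortedAges = [18, 21, 29, 35, 42]; // sample data, must be sorted ascending
thisOld(29, sortedAges) // returns true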

Linear example

//The number of steps you take is directly proportional to your input size
function addAges(array){
  var sum = 0;
  for (let i = 0; i < array.length; i++){ //has to go through each value
    sum += array[i];
  }
  return sum;
}

Quadratic example

//The number of steps you take grows with the square of your input size
function addedAges(array){
  var addedAge = [];
  for (let i = 0; i < array.length; i++){ //has to go through each value
    for (let j = i + 1; j < array.length; j++){ //and pair it with every later value
      addedAge.push(array[i] + array[j]);
    }
  }
  return addedAge;
}
addedAges(sortedAges);
//Nested for loops. If one for loop is linear time (n)
//then two nested for loops are (n * n) or (n^2): quadratic!
//Here the inner loop starts at i + 1, so it runs n(n-1)/2 times, which is still O(n^2).

Exponential example

//The number of steps it takes to accomplish a task is a constant to the n power
//Thought example: trying every possible password of length n. With C allowed characters there are C^n candidates.
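The thought example can be made concrete with a small sketch. `allPasswords` is a hypothetical helper and the two-letter alphabet is an assumption chosen to keep the output small: it builds every string of length n, so the result has C^n entries for an alphabet of C characters.

```javascript
// Hypothetical helper: builds every string of length n over the given alphabet.
function allPasswords(alphabet, n) {
  if (n === 0) return [""]; // exactly one string of length 0
  const shorter = allPasswords(alphabet, n - 1);
  const result = [];
  for (const prefix of shorter) {
    for (const letter of alphabet) {
      result.push(prefix + letter); // each shorter string spawns C longer ones
    }
  }
  return result;
}

const candidates = allPasswords(["a", "b"], 4);
console.log(candidates.length); // 16, i.e. 2^4
```

With 26 lowercase letters and n = 8 the same function would have to build more than 200 billion strings, which is why brute force stops being practical so quickly.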

Python

Examples

Fibonacci

Generators

Generators

AsyncIO

Coding

SOLID

GRASP

Testing

System design

GitHub article - 200622 status

We have to take into account the following three elements:

An external request from a client (an HTTP request from a browser, etc.).
Your code running in some container (a Django app running on mod_wsgi, a Python script listening to RabbitMQ, etc.).
Pieces of infrastructure (MySQL, Redis, RabbitMQ, etc.).

Scaling

Two types of scaling:

Vertical: You scale by adding more power (CPU, RAM) to your existing machine.
Horizontal: You scale by adding more machines into your pool of resources.

Load balancing pattern

A load balancer (software or hardware) sits in front of our servers and receives each incoming request. It chooses which worker will process the request.

HAProxy
NGINX
Traefik
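One simple way a balancer can choose a worker is round robin. The sketch below only illustrates that idea: `createRoundRobinBalancer` and the worker names are invented for the example, and real balancers such as the ones listed above offer several other strategies.

```javascript
// Hypothetical round-robin balancer; worker names are made up for illustration.
function createRoundRobinBalancer(workers) {
  let next = 0;
  return function pickWorker() {
    const worker = workers[next];
    next = (next + 1) % workers.length; // wrap around to the first worker
    return worker;
  };
}

const pickWorker = createRoundRobinBalancer(["worker-a", "worker-b", "worker-c"]);
// Four incoming requests are assigned in turn.
const assigned = [1, 2, 3, 4].map(() => pickWorker());
console.log(assigned); // ["worker-a", "worker-b", "worker-c", "worker-a"]
```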

Caching pattern

Precalculating results, pre-generating expensive indexes, and storing copies of frequently accessed data in a faster backend.

There are two types of cache:

Application caching requires explicit integration in the application code itself. Usually it will check if a value is in the cache; if not, retrieve the value from the database; then write that value into the cache.
Database caching is provided by the database itself; you only configure the database's cache.
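The application caching steps (check the cache, fall back to the database, write the value into the cache) can be sketched as follows. Both `Map` objects are in-memory stand-ins and the key format is an assumption; a real setup would talk to something like Redis and an actual database.

```javascript
// In-memory stand-ins for a cache and a database (illustrative only).
const cache = new Map();
const database = new Map([["user:1", { name: "Mark" }]]);
let databaseReads = 0;

function readFromDatabase(key) {
  databaseReads += 1; // count how often the slow backend is hit
  return database.get(key);
}

function getCached(key) {
  if (cache.has(key)) return cache.get(key); // 1. check the cache
  const value = readFromDatabase(key);       // 2. on a miss, read the database
  if (value !== undefined) cache.set(key, value); // 3. write it into the cache
  return value;
}

getCached("user:1"); // miss: goes to the database
getCached("user:1"); // hit: served from the cache
console.log(databaseReads); // 1
```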

In-memory caches are software that stores the cached content in RAM (Redis, Memcached).

Another kind of cache, which comes into play for sites serving large amounts of static media, is the content delivery network (CDN). CDNs can also provide geographic distribution.

Cache invalidation is the procedure for avoiding inconsistencies between the updated data and the copy stored in the cache.
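One common way to do this is to delete the cached entry whenever the underlying data is written. The sketch below assumes that strategy; `store`, `cached`, `read` and `write` are illustrative names, and other approaches (expiry times, write-through) exist.

```javascript
// In-memory stand-ins: `store` plays the database, `cached` plays the cache.
const store = new Map([["price:42", 10]]);
const cached = new Map();

function read(key) {
  if (!cached.has(key)) cached.set(key, store.get(key)); // fill on a miss
  return cached.get(key);
}

function write(key, value) {
  store.set(key, value);
  cached.delete(key); // invalidate so the next read sees the new value
}

read("price:42");      // caches the value 10
write("price:42", 12); // updates the store and drops the stale entry
console.log(read("price:42")); // 12
```

Without the `cached.delete(key)` line the last read would still return the stale value 10.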

Memcached
Varnish
Cassandra
Redis

Scatter and gather pattern

The dispatcher multicasts the request to all workers in the pool. Each worker computes a local result and sends it back to the dispatcher, which consolidates the results into a single response and sends it back to the client. This pattern is used by search engines such as Yahoo and Google to handle users' keyword search requests.
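A minimal sketch of the dispatcher, assuming each worker holds one shard of documents. The shard contents and the names `searchShard` and `scatterGather` are invented for illustration; in a real system each call would be a network request to another machine.

```javascript
// Hypothetical shards, one per worker.
const shards = [
  ["big o notation", "binary search"],
  ["load balancing", "binary trees"],
  ["message queues"],
];

// Each worker computes a local result (here: a filter over its own shard).
function searchShard(documents, keyword) {
  return Promise.resolve(documents.filter((doc) => doc.includes(keyword)));
}

// The dispatcher scatters the request to every worker and gathers the answers.
async function scatterGather(keyword) {
  const partials = await Promise.all(
    shards.map((documents) => searchShard(documents, keyword))
  );
  return partials.flat(); // consolidate into a single response
}

scatterGather("binary").then((hits) => console.log(hits));
```

`Promise.all` waits for the slowest worker, so in practice a dispatcher usually also needs a timeout for workers that do not answer.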

AMQP and message queues pattern

Message queues allow your web applications to quickly publish messages to the queue and have separate consumer processes perform the processing outside the scope and timeline of the client request.

They allow you to create a separate machine pool for performing off-line processing rather than burdening your web application servers.
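The idea can be sketched with an in-memory array standing in for the broker. Everything here is illustrative: `publish`, `handleRequest`, `runConsumer` and the message shape are assumptions, and with a real broker the consumer would run in a separate process or machine.

```javascript
// In-memory stand-in for a message broker queue.
const queue = [];
const processed = [];

function publish(message) {
  queue.push(message);
}

// The web request only publishes a message and returns immediately.
function handleRequest(email) {
  publish({ type: "send_welcome_email", to: email });
  return "202 Accepted";
}

// The consumer drains the queue outside the request's timeline.
function runConsumer() {
  while (queue.length > 0) {
    const message = queue.shift();
    processed.push(`${message.type}:${message.to}`); // the slow work goes here
  }
}

const response = handleRequest("mark@example.com");
runConsumer();
console.log(response, processed);
```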

RabbitMQ

Map-reduce pattern

Distributed systems

Fallacies of distributed computing

Databases

Replication

Database replication is the frequent electronic copying of data from a database on one computer or server to a database on another, so that all users share the same level of information. The result is a distributed database in which users can access data relevant to their tasks without interfering with the work of others. Replication used this way keeps the copies consistent so that users do not see ambiguous or conflicting data; it should not be confused with normalization, which is about structuring tables to reduce redundancy.

Disadvantages:

There is a potential for loss of data if the master fails before any newly written data can be replicated to other nodes.
Writes are replayed to the read replicas. If there are a lot of writes, the read replicas can get bogged down with replaying writes and can't do as many reads.
The more read slaves, the more you have to replicate, which leads to greater replication lag.
On some systems, writing to the master can spawn multiple threads to write in parallel, whereas read replicas only support writing sequentially with a single thread.
Replication adds more hardware and additional complexity.

Master-slave replication

The master serves reads and writes, replicating writes to one or more slaves, which serve only reads. Slaves can also replicate to additional slaves in a tree-like fashion. If the master goes offline, the system can continue to operate in read-only mode until a slave is promoted to a master or a new master is provisioned.

Disadvantages:

Additional logic is needed to promote a slave to a master.
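A toy sketch of the read/write split and of the promotion logic mentioned above. The node names, `makeNode`, `write`, `read` and `promoteSlave` are all assumptions, and writes are copied synchronously here for simplicity, whereas real replication is usually asynchronous and subject to lag.

```javascript
// Toy in-memory nodes; names and the synchronous copy are assumptions.
function makeNode(name) {
  return { name, data: new Map() };
}

let master = makeNode("master");
let slaves = [makeNode("slave-1"), makeNode("slave-2")];
let nextSlave = 0;

// Writes always go to the master, which replicates them to every slave.
function write(key, value) {
  master.data.set(key, value);
  for (const slave of slaves) slave.data.set(key, value);
}

// Reads are spread across the slaves in round-robin order.
function read(key) {
  const slave = slaves[nextSlave % slaves.length];
  nextSlave += 1;
  return slave.data.get(key);
}

// The "additional logic": promote the first slave when the master is lost.
function promoteSlave() {
  master = slaves.shift();
  return master.name;
}

write("age:mark", 29);
const beforeFailover = read("age:mark");
const newMaster = promoteSlave();
write("age:ana", 35);
const afterFailover = read("age:ana");
console.log(beforeFailover, newMaster, afterFailover);
```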

Master-master replication

Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes.

Disadvantages:

You'll need some logic to determine where to write (e.g. a load balancer).
Most master-master systems are either loosely consistent (violating ACID) or have increased write latency due to synchronization.
Conflict resolution comes more into play as more write nodes are added and as latency increases.
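As one possible illustration of conflict resolution, the sketch below merges two masters with a last-write-wins rule. This strategy, the `timestamp` field and the name `mergeLastWriteWins` are assumptions chosen for the example; real systems may use other rules, and last-write-wins silently discards the older write.

```javascript
// Last-write-wins merge: for each key keep the entry with the newest timestamp.
function mergeLastWriteWins(a, b) {
  const merged = new Map(a);
  for (const [key, entry] of b) {
    const current = merged.get(key);
    if (!current || entry.timestamp > current.timestamp) {
      merged.set(key, entry);
    }
  }
  return merged;
}

// Both masters accepted a write for the same key at different times.
const masterA = new Map([["name", { value: "Mark", timestamp: 1 }]]);
const masterB = new Map([
  ["name", { value: "Marc", timestamp: 2 }],
  ["city", { value: "Madrid", timestamp: 1 }],
]);

const merged = mergeLastWriteWins(masterA, masterB);
console.log(merged.get("name").value); // "Marc"
```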

Database partitioning

Partitioning of relational data usually refers to decomposing your tables either row-wise (horizontally) or column-wise (vertically).
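The two directions can be sketched on a small in-memory table. The `users` rows, the modulo rule for picking a partition and the column split are all assumptions made for illustration.

```javascript
// Made-up table of users.
const users = [
  { id: 1, name: "Mark", age: 29, bio: "Likes chess" },
  { id: 2, name: "Ana", age: 35, bio: "Runs marathons" },
  { id: 3, name: "Lee", age: 41, bio: "Plays bass" },
];

// Horizontal (row-wise): each partition holds complete rows, chosen by id.
const PARTITIONS = 2;
const horizontal = Array.from({ length: PARTITIONS }, () => []);
for (const row of users) {
  horizontal[row.id % PARTITIONS].push(row);
}

// Vertical (column-wise): each partition holds some columns of every row,
// keeping the id so the pieces can be joined back together.
const core = users.map(({ id, name, age }) => ({ id, name, age }));
const details = users.map(({ id, bio }) => ({ id, bio }));

console.log(horizontal, core, details);
```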

Modified Preorder Tree Traversal (MPTT)

MPTT is a technique for storing hierarchical data in a database. The aim is to make retrieval operations very efficient.

The trade-off for this efficiency is that performing inserts and moving items around the tree is more involved, as there's some extra work required to keep the tree structure in a good state at all times.
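A small sketch of why retrieval is cheap, assuming the usual MPTT layout in which every node stores left and right values (`lft` and `rgt` here) assigned by a preorder walk. The tree, the field names and the helper functions are illustrative.

```javascript
// Tree:  Food(1,10) -> Fruit(2,7) -> Apple(3,4), Banana(5,6);  Food -> Meat(8,9)
const nodes = [
  { name: "Food", lft: 1, rgt: 10 },
  { name: "Fruit", lft: 2, rgt: 7 },
  { name: "Apple", lft: 3, rgt: 4 },
  { name: "Banana", lft: 5, rgt: 6 },
  { name: "Meat", lft: 8, rgt: 9 },
];

// All descendants fall strictly inside the parent's (lft, rgt) interval,
// so one range comparison replaces a recursive walk.
function descendants(name) {
  const parent = nodes.find((node) => node.name === name);
  return nodes
    .filter((node) => node.lft > parent.lft && node.rgt < parent.rgt)
    .map((node) => node.name);
}

// The number of descendants follows directly from the two values.
function descendantCount(name) {
  const node = nodes.find((candidate) => candidate.name === name);
  return (node.rgt - node.lft - 1) / 2;
}

console.log(descendants("Fruit"));    // ["Apple", "Banana"]
console.log(descendantCount("Food")); // 4
```

Inserting a new node under Fruit would require shifting the `lft` and `rgt` values of every node to its right, which is the extra work the trade-off refers to.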

Networking

IPC: Inter-process communication.
TCP/IP

Security

CORS

Questions to ask

Reverse interview document

Why the stack you use? Not as criticism, but to explain the problems of that stack and offer alternatives. So, what led you to that architecture?
How long have you been working there? Why? What do you like most about working here?
Tell me about the worst day (work-wise) you've had in the last six months.
How did the company react during COVID?
What brought you here? What keeps you here? What keeps you up at night?

In my experience it's important to not ask “easy” questions that can be answered by a simple yes/no.

Instead of: Are you friendly to remote work? Ask: Will I be working with anyone who is remote, or who works from home on a regular basis?
Instead of: Is the work life balance good? Ask: How responsive are people to emails/Slack over the weekends and after 6pm?
Instead of: Can I have a good career path? Ask: Did any of your senior engineers start out as junior engineers here?

Along the same lines: if a serious problem comes up over a weekend, how willing is the team to stay and deal with it?

The Joel Test:
    Do you use source control?
    Can you make a build in one step?
    Do you make daily builds?
    Do you have a bug database?
    Do you fix bugs before writing new code?
    Do you have an up-to-date schedule?
    Do you have a spec?
    Do programmers have quiet working conditions?
    Do you use the best tools money can buy?
    Do you have testers?
    Do new candidates write code during their interview?
    Do you do hallway usability testing?

Programming
