Amazon AWS EC2 instances propagation time

This is one of my favorite coding interview questions at the moment.
This question reflects a realistic real-world problem. It’s not just an artificial question not rooted in any form in reality.

Problem statement

Ec2 enables customers to rent virtual computers on which to run their own services.
When a request comes from one EC2 instance to another EC2 instance, there is a network latency (to send and receive the request/message).
The latency across all instances is given as an array, latencies.
We send a request from a certain EC2 instance K.
How long will it take for all EC2 instances to receive the request from K?

The interviewer may or may not give the following constraints. Following our coding question canvas, a candidate should clarify the scope of the problem.

Constraints

Method signature: public int networkDelayTime (String[] [] latencies, String k)

Input:

  • latencies
    • latencies[i] has 3 values:
    • source instance.
    • destination instance.
    • time for a request to travel from source to destination.
  • k: the EC2 instance sending the request.

Output:

  • The total latency cost for all EC2 instances to receive the request.

Others:

  • Are all the nodes connected to one another? No. Disconnected graph.
  • Is a node reachable via multiple paths? Yes.
  • If we have multiple paths to one EC2 instance, which one do we choose? The one with the lowest latency.
  • Will the graph be updated as our algorithm is running? No.
  • Sequential searches or just one? One search for now.
  • Is there any latency to hit the initial node? Because we send the message from that first node, there is no extra latency.

A candidate recently asked if he had to take into account the time to receive an acknowledgment per request (sent back to the sender). He related this problem to how UDP broadcasts a message (fire and forget). I found this reasoning brilliant. And this is actually why I love this problem so much. It’s easy to correlate with a real-world issue.

Examples

         START     5              40
            K ----------> L ------------> T
            |                             ^
          5 |                             |
            |                             |  10
            V                             |
            J ------------> H ----------> P
                   5              5

Note: ideally you would come up with something that looks closer to a graph. My ASCII art skills are very limited. Don’t spend too much time building beautiful ASCII graphs. What’s important to note here is that we have a mix of weights/latencies. L -> T has a higher latency than the others, we will see how we can deal with that below.

Solutions

Naive solution: Traverse graph and keep a running sum of latencies

Simplify the problem. Assume we don’t need the lowest / best propagation time.

Steps:

  • Build a weighted directed graph from the latencies array.
  • Traverse the graph and keep a running sum of all the weights/latencies.
    • DFS or BFS will do just fine. As we have to traverse all the nodes once in any order.
    • Check if we’ve already visited a node, if we did, just skip it.

Given the example above, here is how we solve the problem by hand:

    '.' ---> 'K' = '0.0'
    'K' ---> 'L' = '5.0'
    'L' ---> 'T' = '40.0'
    'K' ---> 'J' = '5.0'
    'J' ---> 'H' = '5.0'
    'H' ---> 'P' = '5.0'
    Total:65.0

Time Complexity:

  • Build graph: O(latencies.length).
  • Get the propagation time: O(V+E) where V is the number of vertices (aka nodes) in the graph and E is the number of edges in the graph.

Space Complexity: O(V+E)

Solution 2: Optimized propagation time

The naive solution gives a propagation time, but not necessarily the best.
For example, T can be reached via L with a latency of 40 or via P with a latency of 10. In this case, the latter is preferred so that the total propagation time to all EC2 instances is optimized.

Steps:

  • Traverse graph like explained in the naive solution.
  • Keep track of the lowest latency per node.
  • Each time you explore a node you already explored before, just check if the latency is better.
    • If better, update the best latency for the node.
    • If worse, keep the existing latency.
  • In the end, sum all the lowest latencies per node.

Code

import java.util.*;

public class Solution {
    public static void main(String args[]) {
      Ec2PropagationTime ec2 = new Ec2PropagationTime();
      double delay = ec2.networkDelayTime(new String[][]{
            {"K", "L", "5"},
            {"L", "T", "40"},
            {"K", "J", "5"},
            {"J", "H", "5"},
            {"H", "P", "5"},
            {"P", "T", "10"}
        }, "K");
      System.out.println("Delay:" + delay);
    }

    static class Ec2PropagationTime {

        private static final int FROM_NODE_COLUMN_INDEX = 0;
        private static final int TO_NODE_COLUMN_INDEX = 1;
        private static final int WEIGHT_COLUMN_INDEX = 2;

        public double networkDelayTime(String[][] times, String start) {
            Graph graph = buildGraph(times);
            Map<Node, Double> visits = getPropagationDurationPerNode(graph.roots.get(start), 0, new HashMap<Node, Double>());
            return sumAllPropagationDurations(visits);
        }

        private Graph buildGraph(String[][] rows) {
            Graph graph = new Graph();
            Node from;
            Node to;

            for(String[] row : rows) {
                from = graph.getOrCreateNode(row[FROM_NODE_COLUMN_INDEX]);
                to = graph.getOrCreateNode(row[TO_NODE_COLUMN_INDEX]);
                from.createEdge(to, Double.parseDouble(row[WEIGHT_COLUMN_INDEX]));
            }

            return graph;
        }

        private Map<Node, Double> getPropagationDurationPerNode(Node node, double latencyToReachNode, Map<Node, Double> visits) {

            System.out.println(String.format("Processing %s", node.id));

            if (visits.containsKey(node)) {
                System.out.println(String.format("Already visited %s with a latency of %s", node.id, visits.get(node)));
               if (latencyToReachNode < visits.get(node)) {
                    // better latency found
                    System.out.println(String.format("Better latency for %s:%s", node.id, latencyToReachNode));
                    visits.put(node, latencyToReachNode);
                }
                return visits;
            }

            visits.put(node, latencyToReachNode);

            for(Map.Entry<Node, Double> edge : node.edges.entrySet()) {
                visits = getPropagationDurationPerNode(edge.getKey(), edge.getValue(), visits);
            }

            return visits;

        }

        public double sumAllPropagationDurations(Map<Node, Double> visits){
            double total = 0;
            for(Map.Entry<Node, Double> visit : visits.entrySet()) {
                System.out.println(String.format("%s has a propagation time of %s", visit.getKey().id, visit.getValue()));
                total += visit.getValue();
            }
            return total;
        }
    }

    static class Graph {
        public Map<String, Node> roots;

        public Graph() {
            roots = new HashMap<String, Node>();
        }

        public Node getOrCreateNode(String value) {
            Node node;
            if(roots.containsKey(value)) {
                node = roots.get(value);
            } else {
                node = new Node(value);
                roots.put(value, node);
            }
            return node;
        }
    }

    static class Node {
        public String id;
        public Map<Node, Double> edges;

        public Node () {
            this.edges = new HashMap<Node, Double>();
        }

        public Node (String id) {
            this();
            this.id = id;
        }

        public void createEdge(Node to, double weight) {
            this.edges.put(to, weight);
        }
    }
}

Console output:

Processing K
Processing L
Processing T
Processing J
Processing H
Processing P
Processing T
Already visited T with a latency of 40.0
Better latency for T:10.0
L has a propagation time of 5.0
J has a propagation time of 5.0
T has a propagation time of 10.0
K has a propagation time of 0.0
P has a propagation time of 5.0
H has a propagation time of 5.0
Delay:30.0

Wrap-up

The problem seems simple at first. But can get challenging as new constraints are added.
This problem tests the candidate’s ability to write logical and maintainable code.
It also tests basic data structures knowledge and requires the candidate to deal with ambiguity.

Milestones:

  • Was the candidate able to recognize this as a graph question?

Some of the candidates I ask this question to failed to recognize this as a graph question. They started writing code to process the `latencies` array and work on that array only.
This quickly became unreadable and unmaintainable. They all made the same mistake, not using enough examples to clarify the scope. Visualizing examples helps a lot. See our coding interview tips.

  • Was the candidate able to model the problem as a graph?
  • Was the candidate able to think of BFS/DFS? What are the pros and cons?
  • How is the graph modeled? Adjacency list or matrix? (I personally prefer a list for readability).

Look out for two functions :

  • A function which builds the graph.
  • A master function which iterates over all vertexes and computes the sum.

Good luck!

Spread the love

About the author

Leave a Reply

Your email address will not be published. Required fields are marked *