Announcing Pantry Hopper

Has this ever happened to you?

You’re working from home, you have a meeting in 20 minutes, and you haven’t had lunch yet. There isn’t enough time to cook, so you grab some fast food. Later, you discover there were plenty of lunch options at home all along. You start running scenarios in your head. I could have:

  • had lunch at home
  • saved money / gas / environment (less waste)
  • (probably) had a healthier meal
  • …..

After experiencing this for the umpteenth time, as I was throwing away the fast-food trash, I realized there had to be a better way.
Out of that realization, a new application was born.

Pantry Hopper is a meal tracking and decision application for the rest of us. It is designed to help answer questions such as “What’s for lunch?” Pantry Hopper makes it dead simple to:

  • Bulk-add items from your receipt
  • Organize items using tags
  • Scan a receipt, then rename and tag items

Check out Pantry Hopper-Beta

Learning from failure

I failed a coding challenge some time ago. It looked like a simple enough problem, but after struggling to come up with an answer in the allowed time, I blew it.

As a software professional with about two decades of progressive experience, I was disappointed. I decided to do a mini retrospective to see what I could learn from this experience: what went wrong and how I could do better next time. I had skimmed through the “Cracking the Coding Interview” book in preparation for the interview. However, I still made the most common interviewing mistake: I jumped into coding too soon without formulating a solution in my head first.

The task looked simple enough: write a function that converts an integer to a Roman numeral. I thought I had a pretty good understanding of the problem and that I would be able to come up with an algorithm on the spot. Under pressure, however, I was unable to find a pattern to code.

I remembered the advice to practice solving coding challenges on paper and decided to give it a try. I have had good luck with what I call the “table approach”, where you put all the relevant variables in a table and track their values for each iteration.

I started by writing down the rules:

  • M: 1000, D: 500, C: 100, L: 50, X: 10, V: 5, I: 1
  • A smaller symbol placed after a larger one adds to it; placed before, it subtracts
  • At most 3 consecutive repetitions of the same symbol are allowed
  • The maximum number representable in Roman numerals is 3999
  • 0 has no equivalent symbol

The last three rules were not immediately obvious before I wrote them down. They make the early exit conditions easy to see: if the number is 0 or greater than 3999, there is nothing to do.

Here is a table that shows the Roman numerals representing the digits 1-9 in different places.

| Digit | 1000s | 100s | 10s  | 1s   |
|-------|-------|------|------|------|
| 1     | M     | C    | X    | I    |
| 2     | MM    | CC   | XX   | II   |
| 3     | MMM   | CCC  | XXX  | III  |
| 4     | -     | CD   | XL   | IV   |
| 5     | -     | D    | L    | V    |
| 6     | -     | DC   | LX   | VI   |
| 7     | -     | DCC  | LXX  | VII  |
| 8     | -     | DCCC | LXXX | VIII |
| 9     | -     | CM   | XC   | IX   |

Here is the pattern I see, where a is the place’s own numeral, b is the 5x “parent” numeral, and c is the 10x “grandparent” numeral:

| Digit | Pattern |
|-------|---------|
| 1     | a       |
| 2     | aa      |
| 3     | aaa     |
| 4     | ab      |
| 5     | b       |
| 6     | ba      |
| 7     | baa     |
| 8     | baaa    |
| 9     | ac      |

Let’s code this pattern in JavaScript:

function convertDigit(c, p, gp, digitNumber) {
  // c: current numeral (the 1x symbol, e.g. "I")
  // p: parent numeral (the 5x symbol, e.g. "V")
  // gp: grandparent numeral (the 10x symbol, e.g. "X")
  const digitNumberRomanNumeralTransform = new Map([
    [0, () => ""], // a zero digit contributes nothing (e.g. the tens place of 105)
    [1, (c) => c],
    [2, (c) => c + c],
    [3, (c) => c + c + c],
    [4, (c, p) => c + p],
    [5, (_, p) => p],
    [6, (c, p) => p + c],
    [7, (c, p) => p + c + c],
    [8, (c, p) => p + c + c + c],
    [9, (c, _, gp) => c + gp]
  ])

  const transform = digitNumberRomanNumeralTransform.get(digitNumber)
  return transform(c, p, gp)
}
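
For example, convertDigit("I", "V", "X", 4) returns "IV", and convertDigit("C", "D", "M", 9) returns "CM".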

Every digit in an integer can be represented using 3 Roman numerals specific to its place:

  • 1s: I, V, X
  • 10s: X, L, C
  • 100s: C, D, M

Sounds like a simple Map. (The 1000s place is not in the map: since the maximum is 3999, it only ever needs M.)

function getDigitRomanNumeralsMap() {
  // Maps each digit place to its numeral (1x), parent (5x) and grandparent (10x)
  return new Map([
    [1,   { numeral: "I", parentNumeral: "V", grandParentNumeral: "X" }],
    [10,  { numeral: "X", parentNumeral: "L", grandParentNumeral: "C" }],
    [100, { numeral: "C", parentNumeral: "D", grandParentNumeral: "M" }]
  ])
}

Now I can:

  • Iterate over each digit
  • Pick the 3 roman numerals for the digit place
  • Convert each digit to roman numeral
  • Concatenate roman numerals
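
Here is a minimal sketch of how those steps might fit together (the toRoman wrapper is my own; the live demo linked below may differ):

function toRoman(number) {
  // Early exit: 0 and anything above 3999 have no representation
  if (number < 1 || number > 3999) return ""

  const digitRomanNumeralsMap = getDigitRomanNumeralsMap()
  let result = ""
  // Walk the places from 1000s down to 1s
  for (const place of [1000, 100, 10, 1]) {
    const digit = Math.floor(number / place) % 10
    if (place === 1000) {
      result += "M".repeat(digit) // the 1000s place only ever needs M
    } else {
      const { numeral, parentNumeral, grandParentNumeral } = digitRomanNumeralsMap.get(place)
      result += convertDigit(numeral, parentNumeral, grandParentNumeral, digit)
    }
  }
  return result
}

console.log(toRoman(1994)) // "MCMXCIV"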

See the code in action here.

Retro time… What have I learned from this experience?

  • Read the entire question, ask for clarification and make assumptions explicit
  • Read the entire question, ask for clarification and make assumptions explicit (not a typo)
  • Do not skip the steps above regardless of how easy the question might seem
  • Avoid multitasking: formulate a solution first, then write the code
  • Draw a table to visualize the available data
  • Try to detect a pattern
  • Optionally write some pseudocode to mentally test the pattern
  • Write code to implement the pattern/algorithm

You’re welcome, future self.

Coupling

This is a part of a series of posts to document my learnings after watching Udi Dahan’s Advanced Distributed Systems Design course.

Coupling is one of the fundamental concepts that affect the maintainability of software systems, as discussed before. Coupling measures how dependent the components of a system are on one another.

  • Incoming coupling: who depends on you (aka afferent coupling)
  • Outgoing coupling: whom you depend on (aka efferent coupling)

| Component | Incoming | Outgoing | Comment                                                        | Example                              |
|-----------|----------|----------|----------------------------------------------------------------|--------------------------------------|
| A         | 5        | 0        | Called by many, does not call others; not necessarily bad      | Logging library                      |
| B         | 0        | 5        | Calls many others, not called by others; not necessarily bad   | UI layer code, controllers           |
| C         | 2        | 2        | Both calls others and is called by others; needs investigation | Higher numbers: recipe for disaster  |
| D         | 0        | 0        | Does not call or get called by others                          | Useless code, better deleted         |
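
To make the numbers concrete, here is an illustrative sketch (my own, not from the course) that computes incoming and outgoing coupling from a list of “A depends on B” edges:

const dependencies = [
  ["UI", "OrderService"],     // UI depends on OrderService
  ["UI", "Logging"],
  ["OrderService", "Logging"]
]

function couplingCounts(edges) {
  const counts = new Map()
  const ensure = (name) => {
    if (!counts.has(name)) counts.set(name, { incoming: 0, outgoing: 0 })
    return counts.get(name)
  }
  for (const [from, to] of edges) {
    ensure(from).outgoing++ // `from` depends on `to`
    ensure(to).incoming++   // `to` is depended on by `from`
  }
  return counts
}

console.log(couplingCounts(dependencies))
// Logging => { incoming: 2, outgoing: 0 }  (like component A: called by many)
// UI      => { incoming: 0, outgoing: 2 }  (like component B: calls many)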

Request/Response to Service Boundaries

This is a part of a series of posts to document my learnings after watching Udi Dahan’s Advanced Distributed Systems Design course.

In this post, we will conclude our request/response analysis. We know that request/response may be unreliable, slow and not scalable.

However, it is not possible to completely avoid using request/response and still build a useful software system. Is there an alternative?

There is good news and bad news. The good news is that we can create reliable, scalable and performant software systems by strictly adhering to proven principles that have been around for a long time, such as loose coupling (more on this later) and encapsulation.

The bad news is that it is challenging to adhere to the principles in the face of marketing from tech vendors and cargo cult programming (Netflix, Uber are building microservices so should we… not so fast).

All you need to know about software:
Tightly coupled designs do not work.
Highly cohesive, encapsulated software loosely coupled from other
highly cohesive, encapsulated software works.
-Udi Dahan (paraphrased)

It is sort of an anticlimactic reveal. You would be hard-pressed to find any software developer who has not heard these terms.

In traditional Object Oriented Design, nouns become entities, relationships between the entities lead to entity-relationship diagrams (Customer, Order, CustomerOrder, etc.), and that usually results in architectures requiring quite a bit of request/response to do anything useful.

One of the issues with this approach is that we end up putting attributes that do not need to change together in the same place. In a typical Customer object, among other attributes, we may see:

  • Name
  • Date of Birth
  • Address

If a customer’s name does not change when they move, the name and address do not need to be stored together.

This understanding leads us to one of the most important aspects of the distributed systems design: Finding your service boundaries.

In a nutshell,

  • Design and understand user workflows
  • Determine the pieces of data needed to enable the workflows
  • Organize pieces of data into randomly named piles
  • Identify messages/events needed for coordination between the piles
  • Name the piles (like CustomerLoyaltyService, ShippingService, etc.)

When teams follow this exercise, they realize that the traditional singular “Customer” entity does not exist anymore. First name and last name are in this pile (service to be named later), the shipping address is in that pile, and loyalty status info is in that pile over there. Effectively, we have exploded the customer entity and distributed its attributes across many piles (prospective services). Ideally, each pile has an ID, the customer attributes it is responsible for, and nothing else, so each can perform its respective business capability independently. That is what makes them services.

Service is the technical authority for a given business capability.
-Udi Dahan
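
As a hypothetical illustration (the names are mine, not from the course), the exploded customer might look like this:

// Each pile owns the customer ID plus only the attributes it is responsible for
const marketingService = { customerId: 42, firstName: "Ada", lastName: "Lovelace" }
const shippingService = { customerId: 42, shippingAddress: "123 Main St" }
const customerLoyaltyService = { customerId: 42, loyaltyStatus: "gold" }

// Coordination happens through events that carry IDs, not shared attributes
const orderShipped = { type: "OrderShipped", customerId: 42, orderId: 1001 }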

Note that I am oversimplifying the process here for the sake of brevity. The process of identifying how the domain breaks down is reportedly long, difficult and thankless. Adam Ralph does a great job explaining the nuances in the webinar linked above. In reality, armed with principles such as minimal request/response and no data sharing between services (no publishing events like CustomerAddressUpdated), it is a process of trial and error.

These aspects of the service boundary discovery process make it challenging to provide a tidy explanation with examples in a blog post. Even the most experienced software architects describe it as bumping around in the dark, trying to get a sense of the shape of the world around you, maybe for days, until discovering some insight.

It’s no surprise, then, that this approach is not popular among agile teams biased towards action, who prefer working software over documentation, work in sprints, and timebox their “spikes”.

Request/Response Performance

This is a part of a series of posts to document my learnings after watching Udi Dahan’s Advanced Distributed Systems Design course.

We will focus on performance aspects of request/response interactions. Latency and bandwidth are two important concepts that we need to understand when discussing performance.

Latency: Time to cross the network in one direction.

Remember that latency does not include any processing time (serialization/deserialization of the payload). Furthermore, looking at the scaled latency chart helps us understand the relative cost of remote calls.

var importantService = new ImportantService();
var result = importantService.Process(data);

Assuming that we’re making a remote call across the US, we can conclude that, relatively speaking, if the first line takes 1 second, the second line will take about 4 years. (The rough arithmetic: a local call costs on the order of a nanosecond, while a cross-country round trip costs on the order of 100 milliseconds, a ratio of about 10^8. Scale the nanosecond up to 1 second and the round trip becomes roughly 10^8 seconds, which is measured in years.) Let that sink in for a moment.
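
A quick and unscientific way to feel the difference yourself (the URL is a placeholder):

// Rough illustration: compare in-process work against a network round trip
async function compareLatency() {
  console.time("local")
  JSON.parse('{"answer": 42}') // in-process work: microseconds at most
  console.timeEnd("local")

  console.time("remote")
  await fetch("https://example.com/api") // network round trip: tens to hundreds of milliseconds
  console.timeEnd("remote")
}

compareLatency()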

Bandwidth: the capacity for data transfer of an electronic communications system

Bandwidth is not usually measured. Most developers are not even aware of how much bandwidth is available to their applications in production. Hence the fallacy: bandwidth is unlimited. Traditionally, the amount of data processed by software systems has grown much faster than bandwidth capacity. The network will obviously affect the performance of your application.

Example: Gigabit internet = 128 MB/sec

Available bandwidth = 128 - 60 (TCP cost) - serialization cost (a bigger object graph or XML means a higher cost)

Today I learned: Making remote calls can be extremely slow compared to the rest of the application code. We can safely assume that this is going to hold true for some time.

Request/Response Scalability

This is a part of a series of posts to document my learnings after watching Udi Dahan’s Advanced Distributed Systems Design course.

It’s time to analyze the scalability characteristics of synchronous request/response operations, as in the case of making REST/gRPC/Web Service calls to one or more (micro)services or databases.

Here is a typical scenario where the server is running a language with a Garbage Collector to free up memory (Java, .NET languages):

  • Client sends a request to the server
  • Server receives the request
  • Memory is allocated to get the info out of the request; it is stored on the heap
  • Server sends a request to another microservice or database
  • Server waits for a response to that request
  • Garbage Collector (GC) comes along (after some milliseconds) and asks: Are you done with that memory?
  • Server: Nope, still waiting
  • GC: No problem, marks the memory as Gen1
  • GC comes back (after some more milliseconds) and asks: Done yet?
  • Server: Nope, still waiting
  • GC: No problem, marks the memory as Gen2
  • Now that memory is stuck. The only way to get it out of Gen2 is a global GC.Collect
  • The response comes back, but the GC DOES NOT clear regular memory in Gen2, just the disposables that have gone out of scope
  • GC.Collect will freeze ALL THE THREADS while the GC goes and cleans up the memory in Gen2

I was one of those people who thought that async/await fixes this problem.

No, it doesn’t. That just means the thread gets out of the way so you can take on another request. That actually means you’re gonna fill up faster and use more memory quicker and maybe fail quicker. It is not a solution.
-David Boike
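
David Boike’s point is about .NET threads, but the memory pressure part translates to other stacks too. A hypothetical Node.js/Express handler, just to illustrate:

const express = require("express") // hypothetical setup, for illustration only
const app = express()
app.use(express.json())

// async/await frees the thread (or event loop) to accept more requests, but
// every in-flight request still holds its payload in memory until the slow
// downstream call returns; under load, memory fills up even faster
app.post("/orders", async (req, res) => {
  const payload = req.body // allocated per request...
  const response = await fetch("https://billing.example.com/charge", {
    method: "POST",
    body: JSON.stringify(payload) // ...and held for the entire round trip
  })
  res.json(await response.json())
})

app.listen(3000)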

Thanks to the ADSD course and David Boike’s presentation, I now understand how, under sufficient load, long-running queries can exhaust all the memory on a web server.

Today I learned: In addition to being unreliable, request/response style interaction does not scale well under load.

Works on my machine

This is a part of a series of posts to document my learnings after watching Udi Dahan’s Advanced Distributed Systems Design course.

In the previous post, we demonstrated the unreliability of request/response.

If such a trivial example can show how unreliable request/response can be, why do we continue to write this type of code?

The problem is that request/response code may work in production without any issues for a long time. However, if and when the conditions are right, we may receive a bug report about lost data, duplicate processing, etc.

  • Unit tests pass
  • Integration tests pass
  • End-to-end tests pass
    (If any of these had failed, we would not have deployed to production anyway, right?)

At this point, we may attempt to reproduce the issue manually.
Most developers have some sort of local setup where we start up each dependent service on our machine and proceed with debugging. The services involved may be logically separate, but they are not physically separate (all running on the same machine; more on this in another post). Manual test, click here and there. No dice; close the JIRA ticket with a comment.

Cannot reproduce.

Congratulations on joining the Works on my machine club.


Request/Response Reliability

This is a part of a series of posts to document my learnings after watching Udi Dahan’s Advanced Distributed Systems Design course.

I reviewed the videos multiple times and took detailed notes to really internalize the principles. These are the kind of technology-agnostic principles that have been around for decades and should remain relevant for years to come, as opposed to the intricacies of the JavaScript framework du jour.

Let’s set the stage by stating our end goal.

Design and build performant, maintainable and resilient software systems where adding the 200th feature is as fast and easy as adding the first.

We will start with reliability. Everyone knows that the network is not reliable. However, we continue to write code similar to this:

var result = importantService.Process(data);

Udi Dahan used a timeout exception to demonstrate the reliability issues with this code snippet:

  1. You need to send some data to a remote service for a long-running workflow.
  2. You invoke a synchronous call and wait for the response.
  3. You get a timeout exception.
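
In JavaScript terms, the same interaction might look like this (the endpoint and timeout are made up for illustration):

async function processImportantData(data) {
  const response = await fetch("https://example.com/process", {
    method: "POST",
    body: JSON.stringify(data),
    signal: AbortSignal.timeout(5000) // give up after 5 seconds
  })
  return response.json()
}
// When the timeout fires, we only know we did not get a response in time;
// we cannot tell whether the data never arrived or whether the service
// is still processing it.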

Here are some possible scenarios:

  • The data did not arrive at the server
  • The data arrived, but the service did not complete processing before the timeout

At this point, the client does not know what to do; this is important data that needs to be processed.

  • Resend and risk duplicate processing?
  • Write service code to handle deduplication? What if we don’t own the service?
  • What do we tell the user? I know “An unexpected error occurred” :)

The bottom line is:

If you’re doing this on top of http, you’re always going to be almost there. There’s always going to be the scenario of this time I’ve done it. This time I finally solved all of the edge cases, and then some period of time later in production, some other thing will blow up and you’ll discover oh except that one.
-Udi Dahan

Today I learned: It is very difficult to build reliable systems on top of HTTP with request/response (remote procedure calls (RPC), REST, Web Services, etc.)

Remote Work Revolution

Remote work opportunities are on the rise. The US Census Bureau has documented that more people are working from home.

Stack Overflow, the largest programming community website, recently added a remote checkbox to its jobs page.

It is hard to ignore the effects on recruiting.

While certain companies continue to rule out remote work (Netflix, Yahoo, Slack(!)), many others are embracing it. There are also some companies stuck somewhere in the middle. A prime example is Microsoft: several prominent technologists work remotely for Microsoft, yet there is no obvious way to search for a remote position on Microsoft’s careers site. Does Microsoft allow remote work? It depends…

Will the remote work trend continue or will we see companies telling employees to come back to the office (Yahoo, IBM)?

Due to the high cost of living in major tech hubs such as the Bay Area, Seattle and New York, it is not uncommon to see compensation packages (salary, bonus and equity) for experienced IT workers reaching $400K a year. On the other hand, anecdotal evidence suggests that there are plenty of IT workers located elsewhere willing to accept a smaller comp package as long as they can work remotely.

Let me put on my captain obvious hat for a second…

Given that

  • there is a talent shortage in the tech industry
  • and it is projected to continue
  • and IT worker comp packages are reaching new highs
  • and IT worker comp packages are one of the biggest cost items for most companies

Then

  • all other things being equal, remote work trend should continue to accelerate simply due to basic laws of supply and demand.

There are also less obvious signs of the remote work revolution, like a larger company that does not allow remote work acquiring a startup that actively encourages it. Microsoft’s GitHub acquisition is a good example. This seems to be happening to companies of all sizes across various industries, effectively reversing anti-remote-work policies.

The prediction sounds reasonable:

The future of work is Remote.

Trends/Predictions for the Software Industry

It is that time of the year when most people reflect on the past and make plans/predictions for the future. I have been having some conversations with my colleagues and wanted to put my thoughts on the two main trends dominating the software industry on “paper”.

Serverless and Remote Work

Serverless software applications and architectures will dominate the software landscape in the coming years in my opinion. Serverless coupled with wider adoption of remote work arrangements may have further downstream effects on everything from real estate to social mobility in the developing world.

If software is eating the world, analyzing the trends that affect software construction may yield interesting insights.

Serverless

Serverless software applications are usually

  • Event-driven (someone clicked a link or uploaded a file, it is 4am on a Tuesday, i.e. code runs in response to an event)
  • Stateless (they don’t store any data)
  • Billed only for the milliseconds they run
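
For example, a minimal AWS Lambda-style function in Node.js might look like this (a sketch, not tied to any real deployment):

// Runs only in response to an event, keeps no state between invocations,
// and is billed only for the milliseconds it executes
exports.handler = async (event) => {
  const name = event.name ?? "world" // event payload, e.g. from an API gateway
  return { statusCode: 200, body: `Hello, ${name}!` }
}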

It is called Serverless because someone else (like Amazon Web Services (AWS) or Microsoft Azure) runs and maintains the servers.

Why is this important? As an organization that uses/maintains software to run its business, think about the things one does NOT need to worry about:

  • Operating system updates (e.g. security updates, upgrading to Windows 16, etc.)
  • Network attack surface (no continuously running servers that may be prone to attack)
  • Each process gets its own entire compute environment (someone accidentally uploading a huge file should not affect other users)
  • Paying for server/database licenses, system/server/network admin salaries
    (if you happen to work in one of these areas and worry about being phased out read this)

In a nutshell, serverless will free businesses to spend more of their time and resources on improving the core business. On second thought, make that smart businesses; there will always be companies that are resistant to change.

The avalanche has begun, it’s too late for the pebbles to vote. - Ambassador Kosh

If an organization can use an authentication service, a data storage service, an email service, etc. that scales automatically based on need, and pay only for what it uses, why would it not do so? This will almost certainly be cheaper than paying software developer salaries to build these services in-house.

Given that most companies will want to do more with less, what will the already employed software engineers work on? Werner Vogels may have the answer here.

In the future, all the code you ever write will be business logic. - Werner Vogels (Chief Technology Officer of Amazon)

Coming soon: Remote Work.