Which Skills Are The Most Employable In Tech?


This article is a follow-up to: What skills make you the most employable? Coding isn’t one of them.

Introduction

You’re looking for a new job. You’re about to interview in many places. What do you think are the most important skills that will get you hired?

How do you think candidates will be selected when there are multiple competitors doing well enough?

1) Be Cheap

Did you know that 100% of companies only hire the employees that they can afford, as limited by their budget and cash constraints? Also, 100% of companies only hire the employees that they decide to spend money on, as decided by their culture and HR department.

Employees are insanely expensive, but companies have limited cash and budgets. It is particularly challenging for small and medium companies.

What happens when you do well at the interview but you request more than the company can OR wants to pay? You don’t get the job.

What happens when there are multiple people doing well? The ones who requested a higher salary are rejected first.

Being cheap is a strength. In fact, that’s usually the only quality you have for yourself when you are young and you first start out.

As you grow up, you will become more expensive and despise cheapness. Your rent and food ain’t going to pay themselves. Your cost will be the single biggest obstacle to your employability, for your entire life.

One can ALWAYS get another job for 10% less money and 10% worse conditions (and if it’s not enough, lower it again until you get the job!). Eventually, the cycle ends when the minimum legal wage is reached. There are entire industries that exclusively hire for minimum wage.

The world is driven by costs. Everyone is looking for the unicorn employee that will do the same work for half the price.

2) Have Industry Experience

Relevant industry experience will get you the job, 90% of the time.

That’s doing the same thing at the same place, for example:

  • A CEO at General Electric interviewing to be CEO at Uber.
  • A commodity trader at Goldman Sachs interviewing with JP Morgan to be a trader.
  • A site reliability engineer at Google interviewing with Facebook.
  • A defense contractor interviewing with a government organization for a contract position that cannot be disclosed.
  • A waitress at a restaurant interviewing at another restaurant, a few blocks away.

What do all these people have in common? They will breeze through the interview pipeline. It begins with the resume screening phase, when the hiring manager sees they already do the job they are applying for and immediately forwards them to the next person in the hiring pipeline (possibly on top of the pile if they have a good pedigree). It ends with the interview stage, when they talk about their past experience and the interviewers realize that it’s basically the same work and environment as here. Looks like they’re more than ready to start tomorrow.

People move vertically: From a smaller to a bigger company (Startup => Google), and vice versa. Or across companies in the same league (Google/Facebook/Microsoft/Apple), and back.

Samples of vertical markets: web companies, finance companies, defense/government contracting… (Should I write an article about where to work as a developer?)

People move horizontally: Selling a highly specific skillset, broadly. (Quant in Finance => Data Scientist in a web company).

This sort of move is more risky as an employee. The environment and the job expectations may be very different (despite hinting at a common base in heavy math/statistics).

It’s a small world. Employers know each other, and they have a clue about their respective brands and hiring standards (it’s not an accident that Google carries power on your resume, and it’s justified). In extreme cases, it can go as far as knowing exactly which team to hire from, or not hire from, at their respective competitors, plus what the equivalent team is here.

3) Be Likable and Well Rounded

Work on your presentation and oral skills. You really need to be able to introduce yourself and your work history. Smoothly, no stuttering, no hesitation.

Behave yourself and be helpful. You are both stuck together in the interview for one hour. That’s a preview of you working with your coworkers and clients during the day, EVERY DAY FOR THE NEXT YEAR. You are not expected to become best friends or drinking buddies, just to be workable with.

Being nice won’t give you the job, but being not nice will automatically take it away from you!

By the way, an interview is not the time to point out that you intend to replace your manager (unless that’s the job), that HR has old fashioned questions (where do you see yourself in 5 years?), or that Google stopped asking puzzles 10 years ago because they are a useless indicator of job performance (How many ping pong balls can you fit in a plane? [1] ). Just answer the trivia questions. They are not tricks, they are simply conversation starters to learn more about you.

To conclude, what do you think is the greatest fear of an HR person when she interviews a candidate? It’s to be faced with a dud who can’t talk. The massive awkward blanks and the replies never longer than a single word “Yes… No… Yes…”.

[1] Had jobs in aerospace and ping pong written as a hobby. Can’t tell if the interviewer is being relevant or repeating a dumb script.

Conclusion

Tell me about yourself?

Can you describe your last job? 

 

An adult talks about his work history, shakes hands and gets a job. All the points above apply.

A youngster with no work history is mostly judged on his willingness to work. The cheapness compensates for the lack of experience. All the points above apply.

There are a couple of companies that give technical tests. Like Google that gives programmers a programming challenge, or a car garage that gives mechanics a change-a-wheel challenge. For these companies, you have to pass the bar of the technical test on top of all the points mentioned above. They are the exception, not the norm.

 


What Does It Really Take To Track A Million Cell Phones?


You can find anything and everything on the internet, yet nothing that explains how to track cell phones.

Let us clarify right away: we are not talking about how to track your own cell phone in case it’s lost or stolen. We are talking about tracking everyone that lives, breathes and carries a cell phone.

This is actually incredibly easy and we think that people should be aware of that.

If a representative of a phone service provider with 10 million customers came into my office and asked “What would it take to track every move of our 10 million customers?”, my answer would be “An intern and 6 months”. Then I’d insist that the intern will need a desk, a computer, and basic programming and algebra skills. That’s all it takes.

Imagine for a minute that you are the intern in question. Congratulations and welcome to our company! Your internship begins now, this document will introduce you to everything you need to know.

We’ll go over the basics of cellular networks, geolocation principles, technologies readily available in every cell phone and how to leverage all of that into a truly real-time planet-scale mass surveillance system.

Spoiler Alert: If you are scared of 1984-like scenarios, you may want to stop reading this and bounce to a video with Darth Vader playing the accordion.

A) Foreword

We are in a unique position with cross domain expertise. We combine experience in state-of-the-art tracking systems with past experience in the telecommunication industry.

Whether it’s locating an item in a warehouse, guiding people inside a shopping mall or following stolen trucks, there are many legitimate use cases for tracking, with as many constraints to satisfy: indoors, outdoors, with or without battery, variable precision, etc…

A phone itself comes with numerous technologies built-in: GPS, WiFi, accelerometer, compass, etc…

We’ll focus exclusively on what is needed to achieve easy, effective, reliable, mass-tracking.

B) Requirements

We want to track cell phones. Which ones? ALL OF THEM.

Some constraints:

  • Cell phones are out of control
    • No physical access
    • Hardware cannot be modified
    • Software cannot be installed
  • Users are out of control
    • They will not perform any wanted action
    • They will not opt-in to anything
    • They will not consent to anything
  • Must be scalable to millions of cell phones
    • Self-explanatory

Better precision in time and position[1] is better but does not constitute a goal by itself. It has to be balanced against more important parameters like feasibility, scalability, reliability and costs of operation.

For the avoidance of doubt, we’ll call the project an utter success if we find ourselves able to pinpoint any cell phone as being in a specific block inside a specific city, at a specific hour.

[1] A location is always a position AND a time together. It’s important to keep the two dimensions in mind.

C) Multilateration

Most systems work by “triangulation“. It’s possible to triangulate a specific position by comparing some measures to some points of reference. First things first, that’s actually called multilateration.

If you use a service like a GPS, it does all the work and gives out a position with a radius of error.

If you are the guy making the GPS, or you are trying to mix multiple sensors in a creative way, you need to do the hard work yourself.

Ultimately, it always comes down to 4 methods.

1) Power: Signal power

With information about the transmission power, the reception power and the medium, it’s possible to use physics wave propagation formulas to estimate the distance traveled.

In practice however, this method is extremely unreliable for radio waves, so you NEVER want to use that.

For instance, it’s typical for a long distance radio signal to go up and down 10 fold (±10 dB) within a single second. It changes all the time, and that’s when you are not moving. It gets worse when walls, windows and your head go in and out of the path.

2) AoA: Angle of Arrival

Note: It’s called triangulation when using angles.

With the angle of a signal, it’s possible to determine that the source is within a line (or a cone). Obviously, it works better with highly directive signals.

You can surely picture a rotating radar like you’ve seen a thousand times in movies.

3) ToA: Time of Arrival

With the time and the speed of a signal, it’s easy to determine the distance. t = d/s.

Challenge: Radio waves travel at the speed of light, 299 792 458 m/s.

Measuring a distance with 30 cm accuracy requires measuring the time to within ±0.000000001 seconds (1 nanosecond). That is a hard problem.
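The arithmetic behind that nanosecond is easy to check. The snippet below is only a sketch of the t = d/s relation above; the function name is ours and purely illustrative.

```python
# Speed of light in vacuum, in meters per second.
SPEED_OF_LIGHT = 299_792_458


def timing_precision_for(distance_accuracy_m: float) -> float:
    """Return the timing precision (in seconds) needed to resolve a distance."""
    # From t = d / s: an error in time maps linearly to an error in distance.
    return distance_accuracy_m / SPEED_OF_LIGHT


# A few target accuracies, from "which room" to "which block".
for accuracy_m in (0.3, 30.0, 1000.0):
    nanoseconds = timing_precision_for(accuracy_m) * 1e9
    print(f"{accuracy_m:>7} m needs about {nanoseconds:.1f} ns")
```

Relaxing the target from 30 cm to a kilometer relaxes the timing requirement to a few microseconds, which is the scale the rest of this document works at.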

4) TDoA: Time Difference of Arrival

Also based on time measurement.

It’s possible to use time differences instead of an absolute time.

[Figure: time difference of arrival principles. Picture Source: Locating Lightning Strikes]

The item to be tracked emits a pulse that is received by multiple receivers. The receivers are at known locations and synchronized in time.

By measuring the time difference between the reception of the signal at the receivers, it’s possible to determine the relative distance of the source to the receivers.

Challenge: It requires not only measuring time with crazy precision but also synchronizing clocks across systems.
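To make the principle concrete, here is a minimal sketch with one source and two receivers. The coordinates, the emission time and the helper names are made up for illustration; a real system would only ever see the two arrival times.

```python
import math

SPEED_OF_LIGHT = 299_792_458  # m/s


def arrival_time(source, receiver, emitted_at):
    """Time at which a pulse emitted at `emitted_at` reaches `receiver`."""
    return emitted_at + math.dist(source, receiver) / SPEED_OF_LIGHT


def range_difference(t_a, t_b):
    """Difference of distances (meters) implied by two arrival times."""
    return (t_a - t_b) * SPEED_OF_LIGHT


source = (1200.0, 800.0)                 # unknown in practice
rx_a, rx_b = (0.0, 0.0), (3000.0, 0.0)   # known, time-synchronized receivers

# The receivers never learn the emission time: it cancels in the difference.
t_a = arrival_time(source, rx_a, emitted_at=42.0)
t_b = arrival_time(source, rx_b, emitted_at=42.0)
print(f"source is {range_difference(t_a, t_b):.1f} m farther from A than from B")
```

Each pair of receivers yields one such distance difference, which places the source on a hyperbola; more receivers give more curves to intersect.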

D) Cellular Networks Principles

We’ll go through some basics about cellular networks.

1) Base Station (BTS)

A cell phone communicates with a base station.

There are two channels. One for emission (to the BTS), one for reception (from the BTS). They operate at different frequencies.

The emission channel (to the BTS) is shared by all devices. At any time, there can only be one device emitting.

2) Cellular Network

A BTS covers an area around it. Adjacent BTS form a cellular network.

Two adjacent BTS need to have different frequencies to avoid interference.

[Figure: Cellular Network]

Each operator runs its own network. It may share or resell network service to other operators.

Some operators are virtual (called MVNO). They have no physical infrastructure, they exist on top of another provider. For example, giffgaff [1] runs on top of O2.

[1] Highly recommended provider in the UK.

3) Cell Density

A base station can only cover a limited amount of users. What happens when there are too many users, like in a city center instead of a village?

[Figure: dense cellular network. Double the density. Quadruple the capacity.]

Trivial: cells can be arranged more densely to increase the capacity.

E) Locating A Cell Phone

We saw the basics of cellular networks and the basics of multilateration.

1) Base Station

Your phone has to be in range of a BTS to work. By the simple virtue of having your phone “online“, the operator knows that you are within the range of his station.

As we said before, the density of towers can be adjusted to accommodate the density of users.

A tower has a theoretical range of up to 35 km radius. In a major city, there could be one every km; in the empty countryside, there could be one every 10 km.

That’s enough to locate a phone down to one city.

BTS have to be located carefully to manage their coverage and not jam one another. An operator knows the locations of its BTS. They have to be registered officially with some sort of radio tower registry (the execution varies slightly by country).

P.S. We would like to give some free sites where you can see BTS but they tend to not live long. There is value in providing a good database so it’s never given for free (and if it is, someone will realize their mistake soon).

2) Base Stations x 6


Back to when we were in telecom, a long time ago, we had special test phones provided by the manufacturers.

Think of an old school Nokia phone, except it comes with built-in hardware and software for debugging purposes. One of the built-in tools shows detailed connectivity information that is otherwise not available to consumers.

With that in hand, we can see that the cell phone, right in our hands, is able to detect and maintain connectivity with 4 towers simultaneously, at all times.

Why 4? Because there are 4 in our area. The phone could do more!


A $50 cell phone, even one from a decade ago, can be simultaneously “connected” to 6 stations. This may include stations slightly beyond range, having a signal just strong enough to be detected but too weak to be used for actual communications.

As we like to illustrate nowadays in simple terms: Your phone is a wonder of technology, it will go above and beyond to keep the communication going no matter what. When you talk, one word can go to one tower and the next one to another tower, switching as often as necessary.

On a related topic, this is why you cannot find cheap jamming devices against mobiles. Phones are intended to operate in a hostile environment with thousands of phones competing for the air. A jamming device is like a garden hose in a hurricane. It’s physically impossible for any cheap pocket-size device powered by 2 AA batteries to outcompete the hurricane.

To conclude this paragraph, your phone is constantly talking to multiple stations, not just one. Instead of being in a disk around a station, you can be located to the intersection of multiple disks. Handy for tracking, not so much for your privacy.

More importantly, we need multiple points of reference to be able to perform multilateration. Here they are!

3) Angles

We said that a tower covers a radius around it. In practice, this is suboptimal so that’s not how it’s done.

Instead, a station is usually split into 3 independent beams of 120 degrees.

[Figure: A typical base station. Source: Wikipedia]

A typical BTS. Notice the triangle shape, each face covering 120 degrees.

[Figure: base station setup. The arrangement of Tx and Tx. Source: Kaithrein]

The technical setup, as recommended by a Polish antenna manufacturer.

This limits the positioning to a 120 degree sector. It’s actually very powerful: it increases the accuracy a lot and allows for multilateration with only 2 BTS.

Geometry Trivia: The intersection of 2 circles gives 2 points (mirror images of each other across the line joining the two centers), and it takes a third reference to find which point is the right one. Therefore multilateration in general requires 3 references (e.g. the distances from 3 BTS). In practice, an angle is enough to make the distinction most of the time (e.g. angles and distances from 2 BTS).

This method requires information about antennas and directivity. We just checked one BTS database and it’s there so it looks like it’s not a problem to get. The precision will need to be tested in the wild (wave propagation and construction work are not perfect to the degree).

4) RSSI: Received Signal Strength Indicator

A phone emitter has a maximum power of 2 Watts (3 dBW, or 33 dBm). A phone receiver has a typical sensitivity of 0.000000001 Watt (1 nW, which is -90 dBW or -60 dBm).

The air can attenuate a signal by a factor of 1 billion and your phone still works. Magic!

In a perfect world of undergraduate physics, the propagation loss in the air can be modeled with this equation.

L = 20 · log10(4 · π · d / lambda)

With L the loss in dB, lambda is the wavelength and d is the distance, lambda and d in the same unit.
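To get a feel for the numbers, here is a small sketch. It assumes the model meant here is the standard free-space path loss, L = 20·log10(4πd/λ), and it assumes a 900 MHz carrier, which is our pick for illustration. The function names are ours as well.

```python
import math

SPEED_OF_LIGHT = 299_792_458  # m/s


def watts_to_dbm(watts: float) -> float:
    """Convert a power in watts to dBm (decibels relative to 1 milliwatt)."""
    return 10 * math.log10(watts / 1e-3)


def free_space_loss_db(distance_m: float, wavelength_m: float) -> float:
    """Free-space path loss in dB; distance and wavelength in the same unit."""
    return 20 * math.log10(4 * math.pi * distance_m / wavelength_m)


# Link budget between the 2 W emitter and the 1 nW receiver quoted above.
budget_db = watts_to_dbm(2.0) - watts_to_dbm(1e-9)
print(f"budget: {budget_db:.0f} dB")

wavelength_m = SPEED_OF_LIGHT / 900e6  # assumed 900 MHz carrier, about 0.33 m
for distance_m in (100, 1_000, 10_000):
    loss = free_space_loss_db(distance_m, wavelength_m)
    print(f"{distance_m:>6} m: {loss:.1f} dB")
```

In this idealized model every tenfold increase in distance costs exactly 20 dB, which is the tidy behaviour the real world refuses to follow.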

In the real world, this doesn’t apply at all. The air is not homogeneous and there are obstacles all over the place. The losses can vary by 2 orders of magnitude at any time (and they do). There is no meaningful value to be measured.

Good use of a Kalman filter may help to filter the samples but that’s both complicated and resource intensive for a mediocre result.

We’ve got much better to do than RSSI so let’s not waste our time discussing that.

5) Timing Advance

A channel is shared between many customers, each one gets very short periods of time allocated. You can read an introduction to GSM frames for details.

The time slot might be unusable in the event of an overlap with the previous or the next slot (dedicated to another phone). One thing that could cause unwanted overlap is the propagation delay from the phone to the station.

[Figure: timing advance. The signal takes time to travel from a phone to the station. The delay depends on how far the phone is.]

Each bit is 3.69231 µs long in GSM, and a radio wave can travel 1107 meters in that time. That means a phone located multiples of 1107 meters away will be multiple bits late… we don’t want that!

The propagation delay is accounted for and corrected by a mechanism called the timing advance.

The base station measures how late messages arrive and sends a correction parameter, the timing advance, back to the phone.

It’s a number between 0 and 63 indicating how much advance it should take, in multiples of 3.69231 µs.

For the purpose of geolocation, the timing advance locates a cell phone within a 1107 meter wide annulus around the base station.

For the purpose of being a grammar nazi, the region between two concentric circles is called an annulus.
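Turning a timing advance value into an annulus is a one-liner, sketched below. The function and its `round_trip` flag are our own illustration. The flag is there because the delay the station corrects covers the trip to the phone and back, so GSM references usually quote about 550 meters per step, which also lines up with the 35 km maximum range mentioned earlier. The default follows the 1107 meter figure used in this document.

```python
SPEED_OF_LIGHT = 299_792_458   # m/s
BIT_PERIOD_S = 3.69231e-6      # duration of one GSM bit, as quoted above


def annulus_for(timing_advance: int, round_trip: bool = False):
    """Return (inner, outer) radius in meters for a timing advance value."""
    if not 0 <= timing_advance <= 63:
        raise ValueError("timing advance is a number between 0 and 63")
    # Distance a radio wave travels during one bit period.
    step_m = SPEED_OF_LIGHT * BIT_PERIOD_S
    if round_trip:
        # Assumption: the measured delay is there and back, so halve it.
        step_m /= 2
    return timing_advance * step_m, (timing_advance + 1) * step_m


inner, outer = annulus_for(3)
print(f"TA=3 -> between {inner:.0f} m and {outer:.0f} m from the station")
```

Whichever step width applies on a given network, the logic is the same: one integer per phone per station, turned into a ring on the map.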

Let’s see what this looks like if we put some circles on top of London.

[Figure: Timing Advance Annuluses]

That’s the accuracy a single tower can give with just timing advance (ignoring angles).

Let’s see what the intersection of two stations looks like.

[Figure: Timing Advance with two stations]

That gives two possible areas. It takes a third measure to decide for sure (either an angle or a timing advance).

It’s intuitive enough. The more measures, the better.

Remember: Your cell phone is able to talk to 6 towers at all times, and they can cooperate in tracking it.

It’s not always accurate but when it is, it can pinpoint you to the block you are walking in.

6) Geometry Quick Thoughts

Computing two dimensional intersections of disks[1] is highly complex, both in terms of computational power and in terms of what a cheap intern might be able to understand.

Intersection of circles is a trivial problem though. There are known formulas that can be computed in constant time.

It can be generalized to N circles by simply applying the formula to each pair of circles. Filter out the points which are not within the intended angle and distance from the station (a basic comparison in constant time[2]).
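The approach described above can be sketched in a few lines. This is one possible take, not a reference implementation: the function names are ours, and the sector is assumed to be given as angles measured counterclockwise from the x axis.

```python
import math


def circle_intersections(c0, r0, c1, r1):
    """Return the 0, 1 or 2 points where two circles intersect."""
    (x0, y0), (x1, y1) = c0, c1
    d = math.hypot(x1 - x0, y1 - y0)
    # No usable answer: same center, too far apart, or one inside the other.
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    # Distance from c0, along the center line, to the chord between the points.
    a = (r0**2 - r1**2 + d**2) / (2 * d)
    # Half-length of that chord (clamped against rounding errors).
    h = math.sqrt(max(r0**2 - a**2, 0.0))
    xm = x0 + a * (x1 - x0) / d
    ym = y0 + a * (y1 - y0) / d
    if h == 0:
        return [(xm, ym)]  # circles touch at a single point
    ox, oy = h * (y1 - y0) / d, h * (x1 - x0) / d
    return [(xm + ox, ym - oy), (xm - ox, ym + oy)]


def in_sector(station, point, start_deg, end_deg):
    """True if `point` is seen from `station` within [start_deg, end_deg)."""
    dx, dy = point[0] - station[0], point[1] - station[1]
    bearing = math.degrees(math.atan2(dy, dx)) % 360
    return (bearing - start_deg) % 360 < (end_deg - start_deg) % 360


# Two stations 8 km apart, each reporting the phone at 5 km.
station_a, station_b = (0.0, 0.0), (8.0, 0.0)
candidates = circle_intersections(station_a, 5.0, station_b, 5.0)
# Station A heard the phone on its 0-120 degree antenna: one candidate is left.
print([p for p in candidates if in_sector(station_a, p, 0, 120)])
```

For N stations, the same two functions are applied to each pair, which stays cheap enough to run for every phone at every update.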

The resulting points show something that is approximate but quick and easy to compute. Remember that we have millions of people to track in real-time and only an intern for that!

Call for comment: Dear mathematician reader, please comment if you have any advice on how to find the intersection of complex shapes. [3]

[1] Strictly speaking, this should be treated in 3D. The world is a sphere. There are variations in terrains that should be accounted for, especially in mountain regions.

[2] Angles are trivial to play with in polar coordinates (or spherical coordinates).

[3] We checked how design software handles 2D and 3D intersections (SolidWorks, Catia, AutoCad). Sadly, it is advanced mathematics AND it takes a lot of computational power.

7) Summary

Locating a cell phone:

  • A base station locates the phone inside its range (up to 35 km radius)
  • The timing advance locates the phone in a 1107 meter annulus
  • The angle split locates the phone in a 120 degree sector
  • There can be many stations participating in the process
  • Their measures can be combined to improve the precision

8) Time

Remember that a position is always implicitly linked to a time. A phone is at a specific place at a specific time.

The phone wants to be connected permanently. It is adjusting to the environment in real-time, all the time, typically in a matter of seconds. It is mandatory for the phone to work (calls and messaging).

Being conservative, a phone should be able to be (re)located every minute.


Do the test.

Turn your phone off, send it a message, turn it on, how long to receive the message?

Put your phone in a tin box (to block signal), send it a message, take it out of the box, how long to receive the message?


F) Dependencies

There are some prerequisites to make that tracking system real and deploy it on a large-scale.

1) Base Station Database

The project requires a database of base stations.

Every provider knows where they set up their stations, that’s part of the job of being a service provider. It’s a given if making the project as part of an ISP.

It should be easy enough for anyone to get a high quality database of base stations (easy is not to be confused with inexpensive).

2) Logging BTS Information

The project requires access to BTS signal information.

First, there is an extensive authentication, roaming and payment system embedded in the network. This is necessary to provide service to the right user at the right time at the right price.

Second, regulations in almost every country in the world require providers to save some usage information per user, for many years.

There is massive infrastructure already in place to log and audit accesses, down from the station, up to the high level customer subscription.

The values that are needed may or may not be saved already (Cell ID, TA, …), if they are not, they shouldn’t be very hard to add.

3) Matching Identities With Phones

Assuming that we track cell phones, the final step after a phone is located is to match that phone with the identity of a real person.

There is a whole authentication system built into the network. There are unique identifiers for customer contracts, sim cards, phones, etc…

We are not sure of the details of how this works and how it could be abused. Assume that an ISP can match any connected user with the subscriber.

G) The Known Unknown

We saw how to track every cell phone in service, easily done by the ISP of said customers (and by extension easily achieved by the NSA/GCHQ).

There are some unknowns that may affect the scale and the success of the operation. None that can impair it but some that can bring it up to a whole new level!

1) Near Range Tracking

A phone has to discover stations around it. It’s not possible to know which ones are right without trying.

Technically speaking, there is a possibility that the phone might have to broadcast and try to link to all stations in range [1].

If so, any station in an area would be able to follow any phone in proximity. National providers could track everyone everywhere since they already cover the entire country. Rogue actors could set up dedicated networks for the sole purpose of tracking.

[1] It has to start with timing advance and authentication of the device, thus allowing for multilateration and user identity lookup right away.

2) Cross ISP Traffic

Have you ever been in an area with low reception where the phone displays “emergency services only”?

There is no reception to make regular calls, yet it can make emergency calls, probably by using other networks (read: not the one you subscribe to). This is a legal requirement, cell and service providers have to allow that.

Technically speaking, it means that there is something built-in to allow cell phones to connect to anything through any network and your phone is trying that automatically all the time. (This is similar to the previous point).

If so, it can be abused to track your phone.

3) International Roaming

Ever been to another country? Your phone works just fine, except you’re charged ten times more.

Again, this implies that the phone is connecting to anything. Better though, this implies that other providers are able to reach your current provider somehow, to confirm your access and incur your billing.

Depending on how it’s done in the details, there may or may not be an opportunity to link a cell phone back to its provider and its owner, anywhere in the world.

H) The Known Known

1) Retro and Forward Compatibility

This works on all cell phones and it has worked for decades.

The technology has been out and part of every cell phone at least since the first edition of GSM, circa 1991.

There is no change with 3G, 3G+, LTE. Still works like a charm!

2) This Project Can Be Done By An Intern

The technology itself is within reach of a 15-year-old. Any student who attends telecom 103 is taught enough to come up with that (if only they listened instead of playing on their phones!).

20 years ago, this might have gone unnoticed or ignored. There were only a few stations and a few users. Limited accuracy, limited user impact. It’s easy to imagine an early proof of concept that found it impossible at the time: “It’s gonna take an entire floppy disk to save the positions of 12000 customers! Oh my gosh. We’ll never have the budget for that.”

Nowadays, it’s so trivial it’s frightening. Any cell provider could take an intern and make it happen in 6 months. Gotta save some signal information? It’s already done. Gotta do a bit of algebra? Nothing difficult.

3) Verizon Is Doing That Already

Feel free to read “Verizon” as any major phone provider.

Any service provider automatically gets incredible tracking capabilities and has to keep a history of it. It’s not optional. The first half comes with the phone’s infrastructure, the second half is mandated by regulations.

The core business of a provider is to provide phone service though, not to locate all customers in real-time down to the minute. There is no reason to perfect the techniques written in this document.

4) The NSA Is Doing That Already

Feel free to read the “NSA” as any state sponsored actor.

They want to track everyone in the world. That’s one of their main goals. They have lots of resources dedicated to doing just that. They have the ability to infiltrate providers and/or to deploy their own rogue infrastructure.

Ironically, the most awesome mass surveillance system ever invented is out there already and quite easy to use.

What are the odds that they figured it out? I’d say pretty high.

Conclusion



What’s the difference between a Nokia 3310 and an iPhone 7?

There isn’t any! As long as they are turned on, they can both locate you in real-time, 24/7, with a precision better than 1 square kilometer.

 

 

 

[Figure: Mobile Cellular Subscriptions per 100 people. Source: The World Bank]

[Meme: What if I told you it took 25 years to equip every human being with a personal tracking device]
…and we made them pay for it!

 

The 187 Million Dollars Gmail Bug


Discovery

The scene takes place at a multi-billion dollar company, running some of the most well-known e-commerce sites in the world.

It’s a cold winter day. As on every last Tuesday of the month, it’s the on-boarding for new employees.

Teams and leaders are giving various presentations about the business, the teams, their products, the partners, the competition and many more topics. Some talks are actually quite interesting.

One of the last presenters asks participants to use the service, find something we want and go through the purchase as if we were intending to buy it.

The point of this exercise is obviously to get some opinions and feedback…

I’ll spare you the full narrative of the session, except for one seemingly insignificant detail toward the end: one person in the back of the room says she cannot log in to the site.

If a decade of experience has taught me anything, it’s that for every user who complains, there can be another million who experience the same issue, making your product plain unusable for all of them. Always follow up on feedback, no matter how stupid or irrelevant it seems. It can be the tree that hides the forest.

What can be so special that it doesn’t work with her? Let’s find out.

Debugging

She cannot log in, because she forgot her password. She did the “reset password” many times and it does NOT work.

Why would this not work? Let’s have her try and reset it one more time.

Forgot Password => Reset => “Instructions have been sent to …”

The email is allegedly sent… and it is received by her gmail.

[Figure: Gmail screenshot. New message in the inbox!]

She opens it and clicks the reset link and it doesn’t work.

Why does this not work? The page says “invalid or expired link, try again to be sent a new email“.

So, let’s click again. Except this time, it’s my turn.

We open the gmail, it shows that there is a new unread message again. Great!

Looking carefully, there is something unusual on the bottom of the email conversation…

[Figure: Gmail screenshot, desktop client view. New messages are not shown. They need to be manually expanded.]

The email is here. It’s just hidden!

It turns out that gmail is hiding new messages when they look too similar to the previous one.

All the password reset messages are there, hidden on the bottom, each with its own unique link. We try the latest one and it does work! Password reset.

[Meme: it's not a bug, it's a feature]
Debugging complete

She’s been reading the same old email every single time. She never noticed the hidden messages on the bottom (note that they are quite difficult to spot on mobile).

Reset links are unique and invalidated when a new one is made, hence the errors about invalid links.
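The behaviour is easy to reproduce with a toy model. This is not the site’s actual code, only a sketch of the “one valid link per account” rule described above; the class, its methods and the email address are all hypothetical.

```python
import secrets


class ResetLinks:
    """Toy in-memory store keeping a single valid reset token per account."""

    def __init__(self):
        self._current = {}

    def issue(self, email: str) -> str:
        """Create a new token; any token issued earlier stops being valid."""
        token = secrets.token_urlsafe(16)
        self._current[email] = token  # overwrites the previous token
        return token

    def is_valid(self, email: str, token: str) -> bool:
        """Check a token against the latest one issued for this account."""
        return secrets.compare_digest(self._current.get(email, ""), token)


links = ResetLinks()
first = links.issue("user@example.com")    # the email she keeps re-opening
second = links.issue("user@example.com")   # the newest email, collapsed by gmail
print("old link valid:", links.is_valid("user@example.com", first))
print("new link valid:", links.is_valid("user@example.com", second))
```

Every extra click on “reset password” makes things worse for her: it kills the link she is about to open and hides the one that would work.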

Let’s remember that as a lesson in usability. What may be noticed by one user may be missed by another.

Impact

This password reset procedure is for a billion dollar e-commerce site, used by millions of customers, in most of the countries in the world.

I should say that users buy things infrequently. They buy something once, go on with their lives, maybe come back one day in the distant future. It’s reasonable to expect a sizeable user base to forget their password as often as “every single time”.

The issue is impacting all users who forgot their password (read: already registered on the site), use gmail (not sure about other clients) and don’t notice the hidden messages on the bottom.

 

Let’s see with the data team what’s the impact of this. Incidentally, they just introduced themselves during one of the presentations.

Assuming some percentages of some percentages of some statistics of our sales. (Sorry, private numbers ^^).

The direct impact of this bug is a loss of revenue of $187M per year, simply accounting for people who are unable to log in and place any order.

 

I should add that, as shown time and again during demonstrations, participants will switch to competitors within a few minutes of frustration, especially the ones who are already familiar with them. I don’t know if they are giving up faster or slower than regular users, either way that’s food for thought.

The impact accounting for direct losses, plus indirect losses, plus recurring losses, plus reputation loss, plus word of mouth loss, plus competitors stealing our business, well, recurrently stealing our business, etc, is hard to put an estimate on. It should be within some multipliers.

 

Last but not least. This is on a single business and we’ve got more than just one.

What else is impacted has yet to be determined. That will vary by how the password resets are done.

Fix

We need to force gmail to NOT collapse emails.

Having different subjects should do the trick.

Let’s append the current time to the subject. That is a minor change in the line of code that generates the subject.

i dont often write code but when i do its 3m dollars per chara
What to say next time you’re asked to code in an interview
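For illustration, the fix could look something like the sketch below. The function name `reset_subject` and the base subject text are made up, since the real line of code isn't shown here, and it assumes, as suggested above, that a differing subject is enough to stop Gmail from collapsing the messages.

```python
from datetime import datetime, timezone


def reset_subject(base="Reset your password", now=None):
    """Return a subject line that differs for every reset request.

    `now` is expected to be a timezone-aware UTC datetime; it defaults
    to the current time.
    """
    moment = now if now is not None else datetime.now(timezone.utc)
    return f"{base} ({moment:%Y-%m-%d %H:%M:%S} UTC)"


print(reset_subject())
```

Two requests made at least a second apart get different subjects, so each email should show up as its own message.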

Conclusion

All the internet is potentially affected. You should check whether your business is.

pic - if google fixes google the internet could go up by 1percent of 1percent
There is a percentage of truth behind every meme.

 

What’s The Best Data Warehouse Solution? RedShift vs BigQuery vs Hadoop


That can be explained with a simple flowchart:

datawarehouse flowchart

What’s The Best NoSQL Database? Cassandra vs MongoDB vs Redis vs ElasticSearch


That can be explained with a simple flowchart:

nosql flowchart

Docker in Production: An Update


The previous article Docker in Production: A History of Failure was quite a hit.

After long discussions, hundreds of pieces of feedback, thousands of comments, meetings with various individuals and major players, more experimentation and more failures, it’s time for an update on the situation.

We’ll go over the lessons learned from all the recent interactions and articles, but first, a reminder and a bit of context.

Disclaimer: Intended Audience

The large number of comments made it clear that the world is divided into 10 kinds of people:

1) The Amateur

Running mostly test and side projects with no real users. May think that using Ubuntu beta is the norm and call anything “stable” obsolete.

I dont always make workin code but when I do it works on my machine
Can’t blame him. It worked on his machine.

2) The Professional

Running critical systems for a real business with real users, definitely accountable, probably get a phone call when shit hits the fan.

one-does-not-simply-say-well-it-worked-on-my-machine.jpg
Didn’t work on the machine that served his 586 million customers.

What Audience Are You?

There is a fine line between these worlds and they clash pretty hard whenever they meet. Obviously, they have very different standards and expectations.

One of the reasons I love finance is that it has a great culture of risk. Contrary to popular belief, that doesn’t mean being risk-averse. It means evaluating potential risks and potential gains and weighing them against each other.

You should take a minute to think about your standards. What do you expect to achieve with Docker? What do you have to lose if it crashes all the systems it’s running on and corrupts the mounted volumes? These are important factors to drive your decisions.

What pushed me to publish the last article was a conversation with a guy from a random finance company, just asking my thoughts about Docker, because he was considering to consider it. Among other things, this company -and this guy in particular- manages systems that handle trillions of dollars, including the pensions of millions of Americans.

Docker is nowhere near ready to handle my mother’s pension, how could anyone ever think that??? Well, it seemed the Docker experience wasn’t documented enough.

What Do You Need to Run Docker?

As you should be aware by now, Docker is highly sensitive to the kernel, the host and the filesystem it’s using. Pick the wrong combination and you’re talking kernel panic, filesystem corruption, Docker daemon lock down, etc…

I had time to collect feedback on various operating conditions and test a couple more myself.

We’ll go over the results of the research, what has been registered to work, not work, experience intermittent failures, or blow up entirely in epic proportions.

Spoiler Alert: There is nothing with or around Docker that’s guaranteed to work.

Disclaimer: Understand the Risks and the Consequences

I am biased toward my own standards (as a professional who has to handle real money) and toward the feedback I got (with a bias toward reliable sources known for operating real world systems).

For instance, if a combination of operating system and filesystem is marked as “no-go: registered catastrophic filesystem failure with full volume data loss”, it is not production ready (for me) but it is good enough for a student who has to do a one-off exercise in a Vagrant virtual machine.

You may or may not experience the issues mentioned. Either way, they are mentioned because they are certified to be present in the wild as confirmed by the people who hit them. If you try an environment that is similar enough, you are on the right path to become the next witness.

The worst that can, and usually does, happen with Docker is that it seems okay during the proof of concept and you only begin to notice and understand the issues far down the line, when you cannot easily move away from it.

CoreOS

CoreOS is an operating system that can only run containers and is exclusively intended to run containers.

Last article, the conclusion was that it might be the only operating system that may be able to run Docker. This may or may not be accurate.

We abandoned the idea of running CoreOS.

First, the main benefit of Docker is to unify dev and production. Having a separate OS in production only for containers totally ruins this point.

Second, Debian (we were on Debian) announced the next major release for Q1 2017. It takes a lot of effort to understand and migrate everything to CoreOS, with no guarantee of success. It’s wiser to just wait for the next Debian.

CentOS/RHEL

CentOS/RHEL 6

Docker on CentOS/RHEL 6 is a no-go: known filesystem failures, full volume data loss

  1. Various known issues with the devicemapper driver.
  2. Critical issues with LVM volumes in combination with devicemapper causing data corruption, container crash, and docker daemon freeze requiring hard reboot to fix.
  3. The Docker packages are not maintained on this distribution. There are numerous critical bug fixes that were released in the CentOS/RHEL 7 packages but were not back ported to the CentOS/RHEL 6 packages.
ship crash shipt it revert
The only sane way to migrate to Docker in a big company still running on RHEL 6 => Don’t do it!

CentOS/RHEL 7

RHEL 7 originally ran kernel 3, and RedHat has been backporting kernel 4 features into it, which are mandatory for running Docker.

This caused problems at times because Docker failed to detect the custom kernel version and the features available on it, so it could not apply the proper system settings and failed in various mysterious ways. Every time this happens, it can only be resolved by Docker publishing a fix to feature detection for specific kernels, which is neither a timely nor a systematic process.

There are various issues with the usage of LVM volumes, depending on the version.

Otherwise, it’s a mixed bag. Your mileage may vary.

As of CentOS 7.0, RedHat recommended some settings but I can’t find the page on their website anymore. Anyway, there are tons of critical bug fixes in later versions so you MUST update to the latest version.

As of CentOS 7.2, RedHat recommends and supports exclusively XFS and they give special flags for the configuration. AUFS doesn’t exist, OverlayFS is officially considered unstable, BTRFS is beta (technology preview).

RedHat employees themselves admit that they struggle pretty hard to get Docker working in proper conditions, which is a major problem because they gotta resell it as part of their OpenShift offering. Try making a product on an unstable core.

If you like playing with fire, it looks like that’s the OS of choice.

Note that, for once, this is a case where you surely want RHEL and not CentOS, meaning timely updates and helpful support at your disposal.

Debian

Debian 8 jessie (stable)

A major cause of the issues we experienced was that our production OS was Debian stable, as explained in the previous article.

Basically, Debian froze the kernel at a version that doesn’t support anything Docker needs, and the few components that are present are riddled with bugs.

Docker on Debian is a major no-go: There is a wide range of bugs in the AUFS driver (but not only), usually crashing the host, potentially corrupting the data, and that’s just the tip of the iceberg.

Docker is 100% guaranteed suicide on Debian 8 and it’s been since the inception of Docker a few years ago. It’s killing me no one ever documented this earlier.

I wanted to show you a graph of AWS instances going down like dominoes but I didn’t have a good monitoring and drawing tool to do that, so instead I’ll illustrate with a piano chart that looks the same.

docker-crash-illustrated
Typical docker cascade failure in our test systems.

A test slave crashes… the next one retries two minutes later… and dies too. This specific cascade took 6 tries to go past the bug, slightly more than usual, but nothing fancy.

You should have CloudWatch alarms to restart dead hosts automatically and send crash notifications.
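As a rough sketch of what such an alarm might look like, the helper below builds the arguments for CloudWatch's PutMetricAlarm call. The function name, alarm name, thresholds and SNS topic are illustrative assumptions, not our actual setup. It uses the built-in EC2 "recover" action, which only reacts to failed system status checks; a host that is wedged but still passes them needs a different metric.

```python
def recover_alarm_params(instance_id, region, sns_topic_arn):
    """Build keyword arguments for a CloudWatch alarm (hypothetical helper)."""
    return {
        "AlarmName": f"auto-recover-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per evaluation window (assumed)
        "EvaluationPeriods": 2,  # two failed windows in a row (assumed)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [
            f"arn:aws:automate:{region}:ec2:recover",  # built-in recover action
            sns_topic_arn,                             # crash notification
        ],
    }


params = recover_alarm_params(
    "i-0123456789abcdef0",                      # placeholder instance id
    "eu-west-1",                                # placeholder region
    "arn:aws:sns:eu-west-1:123456789012:ops",   # placeholder SNS topic
)
print(params["AlarmActions"])
```

With boto3 installed and credentials configured, the dictionary could then be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`.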

Fancy: You can also have a CloudWatch alarm to automatically send a customized issue report to your regulator whenever there is an issue persisting more than 5 minutes.

Not to brag but we got quite good at containing Docker. Forget about Chaos Monkey, that’s child’s play, try running trading systems handling billions of dollars on Docker [1].

[1] Please don’t do that. That’s a terrible idea.

Debian 9 stretch

Debian stretch is planned to become the stable edition in 2017. (Note: might be released as I write and edit this article).

It will feature kernel 4.9, which is the latest LTS, published simultaneously.

At the time of release, Debian Stretch will be the most up to date stable operating system and it will allegedly have all the shiny things necessary to run Docker (until the Docker requirements change again).

It may resolve a lot of the issues and it may create a ton of new ones. We’ll see how it goes.

Ubuntu

Ubuntu has always been more up to date than the regular server distributions.

Sadly, I am not aware of any serious companies that run on Ubuntu. This has been a source of much misunderstanding in the Docker community because devs and amateur bloggers try things on the latest Ubuntu (not even the LTS [1]), yet it’s utterly unrepresentative of production systems in the real world (RHEL, CentOS, Debian or one of the exotic Unix/BSD/Solaris).

I cannot comment on the LTS 16 as I do not use it. It’s the only distribution to have Overlay2 and ZFS available, which gives some more options to try. Maybe one of them works?

The LTS 14 is a definitive no-go: Too old, doesn’t have the required components.

[1] I received quite a few comments and unfriendly emails from people saying to “just” use the latest Ubuntu beta. As if migrating all live systems, changing distribution and running on a beta platform that didn’t even exist at the time was an actual solution.


Update: I said I’m never coming back to Docker and certainly not to spend an hour on digging up references but I guess I have to now that they are handed to me in spectacular ways.

I received a quite insulting email from a guy who is clearly in the amateur league to say that “any idiot can run Docker on Ubuntu”. He then proceeded to give a list of software packages and advanced system tweaks that are mandatory to run Docker on Ubuntu, that allegedly “anyone could have found in 5 seconds with Google”.

At the heart of his mail is this bug report, which is indeed the first Google result for “Ubuntu docker not working” and “Ubuntu docker crash”: Ubuntu 16.04 install for 1.11.2 hangs.

This bug report, published in June 2016, highlights that the Ubuntu installer simply doesn’t work at all because it doesn’t install some dependencies which are required by Docker to run. Then it’s a sea of comments, user workarounds and not-giving-a-fuck #WONTFIX by Docker developers.

The last answer is given by an employee 5 months later to say that the Ubuntu installer will never be fixed, however the next major version of Docker may use something completely different that won’t be affected by this issue.

A new major version (v1.13) just got released (8 months since the report), it is not confirmed whether it is affected by the bug or not (but it is confirmed to come with breaking changes).

It’s fairly typical of what to expect from Docker. Checklist:

  • Is everything broken to the point Docker can’t run at all? YES.
  • Is it broken for all users, of say a major distribution? YES.
  • Is there a timely reply to acknowledge the issue? NO.
  • Is it confirmed that the issue is present and how severe it is? NO.
  • Is there any fix planned? NO.
  • Is there a ton of workarounds of various danger and complexity? YES.
  • Will it ever be fixed? Who knows.
  • Will the fix, if it ever comes, be backported? NEVER.
  • Is the ultimate answer to everything to just update to latest? Of course.

AWS Container Service

AWS has an AMI dedicated to running Docker. It is based on an Ubuntu.

As confirmed by internal sources, they had massive trouble getting Docker working in any decent condition.

Ultimately, they released an AMI for it, running a custom OS with a custom Docker package with custom bug fixes and custom backports. They went and are still going through extensive efforts and testing to keep things together.

If you are locked-in on Docker and running on AWS, your only salvation might be to let AWS handle it for you.

Google Container Service

Google offers containers as a service. Google merely exposes a Docker interface; the containers run on internal Google containerization technologies that cannot possibly suffer from all the Docker implementation flaws.

Don’t get me wrong. Containers are great as a concept, the problem is not the theoretical aspect, it’s the practical implementation and tooling we have (i.e. Docker) which are experimental at best.

If you really want to play with Docker (or containers) and you are not operating on AWS, that leaves Google as the single strongest choice, better yet, it comes with Kubernetes for orchestration, making it a league of its own.

That should still be considered experimental and playing with fire. It just happens that it’s the only thing that may deliver the promises and also the only thing that comes with containers AND orchestration.

OpenShift

It’s not possible to build a stable product on a broken core, yet RedHat is trying.

From the feedback I had, they are both struggling pretty hard to mitigate the Docker issues, with variable success. Your mileage may vary.

Considering that they both appeal to large companies, who have quite a lot to lose, I’d really question the choice of going for that route (i.e. anything built on top of Docker).

You should try the regular clouds instead: AWS or Google or Azure. Using virtual machines and some of the hosted services will achieve 90% of what Docker does, 90% of what Docker doesn’t do, and it’s dependable. It’s also a better long-term strategy.

Chances are that you want to do OpenShift because you can’t do public cloud. Well, that’s a tough spot to be in. (Good luck with that. Please write a blog in reply to talk about your experience).

Summary

  • CentOS/RHEL: Russian roulette
  • Debian: Jumping off a plane naked
  • Ubuntu: Not sure. Update: LOL.
  • CoreOS: Not worth the effort
  • AWS Containers: Your only salvation if you are locked-in with Docker and on AWS
  • Google Containers: The only practical way to run Docker that is not entirely insane.
  • OpenShift: Not sure. Depends how good the support and engineers can manage?

A Business Perspective

Docker has no business model and no way to monetize. It’s fair to say that they are releasing to all platforms (Mac/Windows) and integrating all kinds of features (Swarm) as a desperate move to 1) not let any competitor have any distinctive feature 2) get everyone to use docker and docker tools 3) lock customers completely in their ecosystem 4) publish a ton of news, articles and releases in the process, increasing hype 5) justify their valuation.

It is extremely tough to execute an expansion both horizontally and vertically to multiple products and markets. (Ignoring whether that is an appropriate or sustainable business decision, which is a different aspect).

In the meantime, the competitors, namely Amazon, Microsoft, Google, Pivotal and RedHat all compete in various ways and make more money on containers than Docker does, while CoreOS is working on an OS (CoreOS) and a competing containerization technology (Rocket).

That’s a lot of big names with a lot of firepower directed to compete intensively and decisively against Docker. They have zero interest whatsoever in letting Docker lock anyone in. If anything, they individually and collectively have an interest in killing Docker and replacing it with something else.

Let’s call that the war of containers. We’ll see how it plays out.

Currently, Google is leading the way, they are replacing Docker and they are the only one to provide out of the box orchestration (Kubernetes).

Conclusion

Did I say that Docker is an unstable toy project?

Invariably some people will say that the issues are not real or in the past. They are not in the past, the challenges and the issues are very current and very real. There is definite proof and documentation that Docker has suffered from critical bugs making it plain unusable on ALL major distributions, bugs that ran rampant for years, some still present as of today.

If you look for any combination of “docker + version + filesystem + OS” on Google, you’ll find a trail of issues with various impact going back all the way to docker birth. It’s a mystery how something could fail that bad for that long and no one writes about it. (Actually, there are a few articles, they were just lost under the mass of advertisement and quick evaluations). The last software to achieve that level of expectation with that level of failure was MongoDB.

I didn’t manage to find anyone on the planet using Docker seriously AND successfully AND without major hassle. The experiences mentioned in this article were acquired by blood, the blood of employees and companies who learned Docker the hard way while every second of downtime was a $1000 loss.

Hopefully, you can learn from our past, so as not to repeat it.

mistake - it could be that the purpose of your life is only to serve as a warning to others

If you were wondering whether you should have adopted docker years ago => The answer is hell no, you dodged a bullet. You can tell that to your boss. (It’s still not that useful today if you don’t have proper orchestration around it, which is itself an experimental subject).

If you are wondering whether you should adopt it now… while what you run is satisfactory and you have any considerations for quality => The reasonable answer is to wait until RHEL 8 and Debian 10. No rush. Things need to mature and the packages ain’t gonna move faster than the distributions you’ll run them on.

If you like to play with fire => Full-on Google Container Engine on Google Cloud. Definitive high risk, probable high reward.

Would this article have more credibility if I linked numerous bug reports, screenshots of kernel panics, personal charts of system failures over the day, relevant forum posts and disclosed private conversations? Probably.

Do I want to spend yet another hundred hours to dig that up, once again? Nope. I’d rather spend my evening on Tinder than Docker. Bye bye Docker.

Moving On

Back to me. My action plan to lead the way on Containers and Clouds had a major flaw that I missed: the average tenure in tech companies is still not counted in years, thus the year 2017 began with me being poached.

Bad news: No more cloud and no more Docker where I am going. Meaning no more groundbreaking news. You are on your own to figure it out.

Good news: No more toying around with billions of dollars of other people’s money… since I am moving up by at least 3 orders of magnitude! I am moderately confident that my new immediate playground may include the pensions of a few million Americans, including a lot of people who read this blog.

docker your pension fund 100% certified not dockeri
Rest assured: Your pension is in good hands! =D

Career Advice and Salary Negotiations: Move Early and Move Often


Context

The following is hard-earned experience for advancing a career quickly. It applies exclusively to tech hubs, in particular London, Silicon Valley and New York.

Your mileage may vary, especially depending on your location, your experience and your skill.

Disclaimer: I’m seriously biased toward good performers. There are people who can fizz buzz and people who landed on this blog by accident.

Introduction

The fastest way to advance your career is to move early and move often, especially when you’re young.

It’s a lot about money. The only way to get substantial raises is to leave your company. This is especially true when you just start out and/or you’re seriously undervalued (pick any combination: young, first job, naive, didn’t negotiate, just came from abroad, etc…).

It’s also about long-term savings. Not negotiating your jobs will cost you millions over your lifetime. You want to get to a decent level as quickly as possible, you’ll get to stay there for your future jobs.

It’s also about opening your eyes and widening your horizons. You learn the most when you change companies and are put in a completely different environment. Also, having had a lot of jobs gives you points of comparison to know whether you’re in a good place or not.


Disclaimer: This article will be about getting offers while threatening to quit your job. In business words, that’s called negotiating.

Lesson #1: ALWAYS negotiate.

Lesson #2: Negotiations are based on leverage.

Having a job and having competing offers are your leverage.


Chapter 1) Ground Rules

We’ll start with some ground rules and some myth busting.

Rule #1: You will NOT resign from your current job UNTIL you have a SIGNED contract with a new company.

(I’d put that as text size 20, bold, in flickering red if this blog allowed text formatting).

Rule #2: You already have a job.

Having a job is the strongest leverage you have. If a prospective employer doesn’t give you the terms you want, you stay where you are. You’ve got nothing to lose.

Rule #3: Leverage.

Leverage, leverage and more leverage!!! A negotiation is all about leverage. The person with the most leverage gets what he wants.

Rule #4: No one cares. No one is gonna get hurt.

The HR/manager is hiring 10 guys a week, he’s seeing candidates and negotiating all day long, every day, and he’ll have forgotten your name by the time you leave the room or hang up the phone.

Rule #5: NEVER undersell yourself. NEVER talk yourself down. NEVER give away your positions.

Ever heard sentences like “It’s pathetic to pay so little because you know I have a family to feed and I don’t have a choice” or “I would like xxx$… but I’m willing to negotiate [down]”?

You’re actively playing against yourself when you talk like that. Don’t do it!

Tip: Whenever this sort of nonsense comes to your mind, slap yourself very hard in the face. Over time, your brain will learn to think better. (I am a strong believer in Positive Punishment).

(Also works if you think hard about slapping yourself, without actually slapping yourself).

Rule #6: Paper is real. Talk is cheap.

The only thing that matters is official papers with a signature on them. Talking doesn’t engage anyone into anything. If all you’ve got is a verbal promise then you’ve got nothing.

Corollary: Assume everything you’ll ever hear from a recruiter/HR/manager is a straight lie. (But what about the 20% bonus I’ve been promised? Ahah. Never existed!)

Rule #7: Be relentless, inflexible and never give up.

As one mentor taught me once, the secret to negotiations is to “be an asshole” [1]. In essence, that means to be relentless and inflexible.

Think about it for a minute. A salesperson’s only goal is to close sales, an HR person’s role is to hire people. They’re the same thing. If one is talking to you and is really close to getting the sale (i.e. you) but not getting it, it’s a very frustrating position for him to be in. Remember, he HAS to sell. Eventually he’ll give up some slack to close the deal.

When you never lower your standards, you’ll never get less than your standards. But what if the negotiation goes wrong and everything is ruined? A negotiation cannot go wrong. At worst, you just call back the sales guy to accept the terms that he wanted to give you in the first place (and he’ll happily reply because it’s closing a deal and that’s his job, he’s measured on that).

[1] Then he went on to negotiate $4M more in stock grants, while most of the employees had peanuts. It was a significant learning experience!

Rule #8: ALWAYS stay polite and courteous, no matter the circumstances.

When you receive an email that’s killing you, you write a response email that really puts that guy back in his place, then you go for a walk and take a breath. When you come back, you delete it and write the email you’re actually gonna send.

When I suggest that you stand up in an interview and leave the room (Who doesn’t love a dramatic exit?), in practice it means concluding in the middle of the 4th interview round that “We are not a fit for each other. Let’s stop the interview there and save both our afternoons, shall we?”, then standing up, saying goodbye to your interviewer(s) and politely asking the way to the exit.

Rule #9: Getting massive raises is fairly easy and common.

And the secret to achieve it is to start incredibly low. It’s really nothing to brag about.

Look around you for a minute. Let’s say you’re an American working at a typical tech company. Chances are that there are people earning $80k a year next to people earning $120k a year next to people earning $160k a year, for a similar position.

How does the former quickly become the latter? Definitely not by being nice and waiting patiently for a raise. One can only progress quickly by being good and being bold. Meaning, move early and move often.

Chapter 2) Be Goal Driven

Your single and only goal during a job search is to get offers, as in paper offers.

A contract is the guarantee that there is a job for you, at this place, at this moment, at this price, at these conditions. It’s both leverage and knowledge for YOU. You’re ready to go work at C company for £XX, assuming you sign.

If it’s satisfactory, you can sign and go there right away. (Once in a while, there’s a good offer on the first try). Otherwise, you can keep it in your offer letters collection. Who knows, maybe things will change in a year and you’ll reopen the conversation.

Whenever an interview process is stopped before the contract stage, it’s wasted time for everyone. You don’t know whether it would get real. You have incomplete information. You can’t sign it and work there. You can’t show it to other prospective employers.

Chapter 3) How Much Should You Ask For?

The only way to find out is to interview and get offers.

Basically, you need to perform a binary search. When you get an offer, you know that you can get at least that. Then you repeat the process a bit higher.

negotiation diagram
Binary Search applied to job offers

Note: Notice that the process goes on forever. In the real world, you have to stop at one point. Mastery is not achieved by knowing when to play (you’re always playing!) but by knowing when to close a play.
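For the programmers in the room, the idea can be sketched as a plain bisection. Everything here is hypothetical: the `accepts` callback stands in for a whole interview process ending in an offer (or not), the numbers are invented, and the sketch assumes you already know one ask that gets offers (`low`) and one that gets rejected (`high`).

```python
def find_rate(low, high, accepts, step=1_000):
    """Bisect between an ask known to get offers and one known to be refused."""
    while high - low > step:
        ask = (low + high) // 2
        if accepts(ask):
            low = ask   # got an offer: you know you can get at least that
        else:
            high = ask  # rejected: the ceiling is somewhere below
    return low


# Pretend the market tops out at 97k; in real life you never see this number.
hidden_ceiling = 97_000
rate = find_rate(60_000, 140_000, lambda ask: ask <= hidden_ceiling)
print(rate)
```

The `step` parameter plays the role of knowing when to close a play: below a certain gap, another round of interviews is not worth the time.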

Chapter 4) Compensation Package

The only real thing in the universe is base salary. That’s the only thing you get at the end of the month for sure. That’s the first thing to negotiate.

Then comes the package which includes but is not limited to:

  • Base salary
  • Vacations, Sick days
  • Duration of your commute
  • Healthcare
  • Pension, 401k
  • Working hours
  • Food, Cafeteria
  • Shares, Stock, RSU
  • Bonus
  • Sign-on bonus

Some perks can only be known and guaranteed when in a contract (e.g. vacations). That’s why it’s very important to get to the contract stage. Some are variable and may disappear or, worse, have been a lie from the start (e.g. bonus, shares). These are very risky and must be treated with caution.

You should negotiate base salary first, concrete perks (e.g. sign-on bonus, vacations) second, risky perks last (e.g. bonus, shares).

There are entire industries that get candidates by significantly over-promising and under-delivering (e.g. start-ups). To this day, I have never met a company that said they paid X amount of bonus and actually paid that amount in full.

Usually, companies have a culture. Perhaps they promise 20% yearly bonus but in practice have been giving 10% to everyone for many years, the promise is oversold yet there is a non-negligible number that should be accounted for. The only way to have a clue is to get insider data, most of the time you can’t and you’ll have to take decisions partially blind.
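One way to account for that number is to discount the promised bonus by what the company has historically paid out. The sketch below is only an illustration: the function name and all figures are made up, and the payout ratio is exactly the insider data you usually don't have.

```python
def expected_total(base, promised_bonus_rate, payout_ratio):
    """Base salary plus the bonus you can realistically expect.

    payout_ratio is the fraction of the promised bonus that actually gets
    paid, e.g. 0.5 when 20% is promised but 10% is what people receive.
    """
    expected_bonus = base * promised_bonus_rate * payout_ratio
    return base + expected_bonus


# Offer A: 100k base with a "20% bonus" that historically pays half.
offer_with_bonus = expected_total(100_000, 0.20, 0.5)
# Offer B: 108k base, no bonus at all.
offer_flat = expected_total(108_000, 0.0, 1.0)

print(offer_with_bonus, offer_flat)
```

With these made-up numbers, offer A still comes out ahead, but by far less than the headline 20% suggests, and only offer B's number is guaranteed.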

Chapter 5) How Much Can You Charge?

There is no such thing as a “market rate” or a “fair compensation”. All there is are individual companies who run a specific business, generating a variable amount of cash, who’d get a variable gain from your service, with more or less favorable compensation policies.

At any point in time and space, there is always an endless supply of companies willing to hire for cheap. It’s fine when you first start out because your greatest quality is to be cheap and have low standards (we’ve all been there: you go where you can, not where you want).

As you mature and gain experience, you’ll ask for more money and decent work conditions, and there will be fewer places to work for.

Ultimately, if you’re a top performer and you’ve done your homework, you’ll reach a point where there is only a handful of acceptable companies in the area.

No matter where you are in your career, you should always know where you will send your resume next.

Never bother about the “how much?“. Your only concern should be the “where?”. You need to search, network and stay informed about worthwhile jobs and companies. 

Pick companies and go play binary search with them, it will tell you the how much. If you have no selection criteria (WTF?!), just pick whatever companies are in your area.

Note: The difficulty to get into a company has only a moderate correlation with how much it pays.


When there are 3 start-ups offering from $X to $2X and Goldman Sachs offering $3X, your rate is $3X + 10% (cuz you negotiated!). It has little to do with an imaginary global tech market ruling the world and all to do with you pulling your finger out and selling your service to the highest bidder like there’s no tomorrow.

if you want to have more you have to become more for things to improve you have to improve for things to get better you have to get better
One doesn’t get a rate, one makes it happen!

Chapter 6) Where To Work?

You gotta figure that out. Search, network, search some more.

For instance, let’s say I’m doing DevOps, SRE and system engineering for highly-scalable-low-latency-distributed-large-scale-systems.

In London, the easy money is usually in DevOps contracting (not to mention the neat flat tax rate). The hard money is usually in finance (ignoring that you can’t work in t-shirts). The usual big valley tech companies pay poorly (compared to contracting and finance).

For instance, if you hang around in the right pubs at the right time, you can overhear conversations like “A recruiter from Google contacted me to interview with them, I asked what’s their top range and they gave a number 40% less than I’m earning right now… Wow.”. Here in London all Google employees are easy pickings (please make sure your LinkedIn is updated!).

In SF/NY, the easy money is usually in big tech companies (Facebook, Google). A lot of the compensation comes from massive stock grants, so you have to negotiate stock hard and get it refreshed every year. (Note that the stock is liquid and publicly traded; it can be sold automatically every month.)

If you hang around on Hacker News for too long, you can find conversations about Netflix not believing in stock and bonus. If you keep reading between the lines, you can figure out that they paid massive base salaries in a big recruitment campaign to poach everyone from everywhere else.

Of course, not everything is about money. You may prefer a place with a shorter commute, fewer hours, or working from home 1 day a week to take care of your child. You’re considering a mix of compensation, personal criteria, location and qualifications.


In my experience, there is no compromise to be made on compensation. The great companies are more successful all across the board, they pay more AND they have more money AND they treat their people better AND they have better work conditions AND more interesting projects AND …

Basically, there are the good organizations and the bad organizations, there is little middle ground. Given the choice, you always want to pick a 10x organization.


Chapter 7) Network, Network, Network…

Network in real life. Network online. Network at the pub. Network some more.

Do you know how much your current company pays other people around you? Well, you should!

Do you bring your co-workers one by one at lunch at a nearby pub to talk about secret compensations and office politics? Well, you should!

Do you have a LinkedIn up to date? Well, you should!

Do you have your previous co-workers on LinkedIn to follow-up on them and their new jobs, ready to message or be messaged if an opportunity pops up? Well, you should.

Do you regularly have drinks with your ex co-workers and ex-recruiters to keep up with the market and who’s where? Well, you should!

Chapter 8) Why Are We Doing All That?

We don’t negotiate salary because it’s easy, we negotiate because it’s important. Not negotiating your salary will cost you millions over your lifetime. Negotiate hard while you’re young and growing. It will impact the compensations of all your future jobs.

Have you ever asked around only to notice that 2 people performing the same work with the same qualifications have 30% difference in pay? Which one do you want to be?


when you life your life by poor standards you inflict damage on everyone who crosses your path especially those you love
Please negotiate your salary. It’s making the world a better place.

Chapter 9) Start-ups Suck

Disclaimer: I am in my “start-ups suck” phase.

Start-ups are a workaholics’ paradise. You’ll work a lot, for little money, with no healthcare and zero pension. All so you DO NOT make 1 million later because your company went bust, and if it didn’t your shares are worthless anyway.

The era of breaking the bank as a startup employee is long gone. Don’t expect to become a millionaire with that. The VCs and the executives get all the returns if there are any, the employees get nothing. (Note that shares can have a negative value: they may cost you money to acquire and/or generate taxes.)

Think for a minute: how much of a pay cut did you take? Do you have a bonus at all? How much are your shares REALLY worth? What are your pension and healthcare plans? How often are you on call? How many hours a week do you work? When was the last time you had to work on a week-end?

While these issues are typical of start-ups, they can be found in any company. Take a minute to think about your current situation and where you want to be (or not be). Would you be better off at a big tech company or a big financial institution?

Chapter 10) How Much Do You Want?

Question: “What salary do you want?”

The only acceptable answer is: “It will take $xxx base salary to leave my current company.”

Question: “What salary do you want?”

Answer: “It will take $xxx base salary to leave my current company.”

Question: “What is your current salary?”

Answer: “It will take $xxx base salary to leave my current company.”

Question: “What is your current salary?”

Answer: “It will take $xxx base salary to leave my current company.”

The only alternative answer is: “I need to know what kind of perks, hours and pension you have before I can give you a number. Let’s begin with: what pension plans do you have?” [1]

You must figure out what you’re worth and you must ask for it. You won’t know at first, you’ll have no idea what their perks may be, you won’t know what you’ll work on. The entire point of this article is that you need to go get offers to find out. More interviews, more job offers, more binary search!

There is a myth that you should not give a number first. It’s 100% bullshit. If you let an employer give a number first, he will lowball you every single time. It gets worse from there: the conversation is now anchored to that low number and it will be very difficult to negotiate up.

As a rule of thumb, if an employer says a number, it’s guaranteed 10% under what they were willing to pay without question.

Give a fixed number, never give a range. Ranges are evil. Whenever a range comes up, the employer thinks of the bottom, the employee thinks of the top. You’re both pretending to be in agreement while really ignoring each other.

[1] I’m fairly confident I’ve got the best pension in town. I’m fairly confident I’ll reply to their reply with an egregious comment.


What do you have to lose by doing that? Absolutely nothing. Remember rule #1. You already have a job and you ain’t leaving until you get the terms you want written down in a contract. That’s your leverage.

I once had a conversation with an HR person in which he repeated the same question 9 times (the last 4 worded slightly differently). I repeated the same answer 8 times (then I changed the topic). This is a perfect example of being an asshole (read: relentless and inflexible).


Chapter 11) Move Early And Move Often

The average tenure for [young] qualified employees is around a year.

On the one hand, I can’t think of a single manager/HR who cares about keeping their employees. (I can think of many who pretend to care, but none who would actually lift a finger if necessary).

On the other hand, I can’t think of a single manager/HR who wouldn’t poach an employee from somewhere else [1].

The combination makes for a very aggressive environment. Thus the best advice I can give to youngsters is to move early and move often. Don’t hate the player, hate the game.

[1] I loved that day when I joined a new job, only to find an email in my LinkedIn that evening from a Facebook recruiter inviting me to interview with them. [2]

[2] I loved that day when I interviewed with Facebook, only to get an email from a headhunter that evening to recruit for a recent finance shop in town managed by 2 well known ex-Goldman Sachs partners.


It seems that short tenure is an evolutionary trait of the industry as a whole. There is no sign of it getting longer any time soon.

In fast paced companies, a sizeable chunk of work can be done in 6-18 months and delivered to millions of customers. It’s very project-oriented: a project is executed, quickly. In fact, even if people would like to stay, it’s not necessarily that simple, they may have to move internally or externally.

Despite the incredible turnover, it’s good enough to deliver and make huge profits. Sure it’s not optimal. A business only needs to be sustainable, not optimal.


Chapter 12) Should I Accept A Counter Offer?

Do you want to stay at your current company? If “YES”, then yes you should accept a counter offer. Otherwise, no.

Should you even bother asking for a counter offer? If you’d like to stay then yes.

Obviously, your life and your decisions depend on you so there is nothing we can say to help you there.

Please don’t trust articles which say that all people who accepted a counter-offer have left shortly afterwards anyway. These articles are written by recruiters who get a massive $20k-$100k commission only if you leave your company. They will say anything to get you to accept an offer with them, they are not acting in your interest. If you prefer to stay, stay!

Chapter 13) How To Ask For A Counter Offer?

Go to the people in the position of power, usually your manager (but not always, politics can be tricky). Tell him that you have a counter offer for $xxx, you’re fine here and you will stay provided your compensation is adjusted, it’s only about money, he needs to match and he needs to match NOW because you’ve got the contract from the other company in your email ready to be signed (I love DocuSign) and you’re not gonna wait (I hate waiting).

If you’re a player: Print the half-page of the contract which contains the numbers to be matched, highlighted in bright pink, with the phone numbers written in the corner for your director and people he may want to call.

It should only take 3 minutes to lay out the situation. Speak slow, speak calm. This is a perfectly normal business request. Businessmen do that all day long, every day of the year. (Don’t worry. You’re NOT doing anything wrong, you’re NOT taking anyone hostage and you’re NOT hurting anyone by doing that. You’re just being an adult).

Don’t get into a discussion and don’t get into an argument. There isn’t a single thing he says that could matter. An experienced manager will probably have a script for that situation, don’t bother listening. It’s sure not the time to be manipulated into thinking the company has a limited raise pool or whatever he may invent.

Basically, he’s gonna need the day to talk to a few people and take a decision.

If he wants to keep you, he’ll pay. You should have a written salary amendment, signed, by the next day (2 days top). If you don’t get one, they’re not interested in keeping you.

If he doesn’t want to pay, the only two cards in his hands are to stall for time and to make [fake] verbal promises. An experienced bad-ass manager will never tell you straight to your face that your request is denied, instead he’ll give excuses and talk about future raises. An alternative simple strategy is for him to make himself unreachable and unavailable, to prevent you from talking any further and from handing in your resignation (you can give it to any administrative person if need be). Watch out for these red flags!

Chapter 14) Counter Offers Will Go Wrong

Disclaimer: I am biased toward good performers.

So far in my life, I have never seen a raise request or a counter offer go well. Or well, to be fair, I have seen a lot of 5-10% raises for a lot of people [1], and a few bigger ones.

There is a sizeable amount of them which are okay. There is also a sizeable amount of them which are for people who are seriously undervalued, who’ve been underpaid for years, still are after that raise, and would get outmatched easily by walking to the other shop across the street.

When you find out your company has undervalued you the whole time, the entire payroll is underrated, and the raise you’ve been given barely puts you on par with the low end of the first company across the road, you are sure leaving the hell out of here.

Please refrain from sending an inflammatory goodbye email to reveal the true evil nature of your management. Stick to a short goodbye message and make sure to insert your contact info and LinkedIn so people can add you. This job is over, networking just begins!

[1] If you’ve got someone for a steal, ensuring they get a 5-10% yearly raise or a 5-10% one-off bonus is the cheapest way to make them feel appreciated and blind them just enough to the outside world.

herding sheeps
Resignations usually happen in waves.

A Personal Story:

I remember one raise I asked for once, a meagre 33% that was perfectly reasonable (quite low as it turned out). They invited me to a meeting with the other managers and it didn’t go as planned, I’ll spare you the details. At one point, my phone rang in the middle of the meeting. Some people pushed back hard on the request, be it for personal reasons or because it’s allegedly crazy to give a two-digit raise to anyone. Everyone left the room in shock and despair.

I checked my phone, the missed call was from another company I interviewed with, to confirm I got the job. I called back to give my name/address and had a contract signed the next day for a substantial raise (plus additional bonus and perks).

My manager came to my office on the next Monday –interrupting me while I was setting up the printer to print my resignation letter– to say everything’s fine and I get half the raise I asked for. I nodded in acknowledgement… only to give him my resignation the next day (the printer was challenging!).

My last action at this job was to publish a job posting for my role at £120k base, which was more than any of us was earning. The rest of the team quit promptly.

gif - plane take off going wrong.gif
Didn’t plan to land a job that way.

That experience taught me 3 valuable lessons:

  1. Always have a backup plan.
  2. Involve only the single person that’s needed to bring the cash. More people can (and will) only cause trouble.
  3. Show a competing offer. Don’t talk and don’t argue. It can only go downhill.
  4. I’d be earning £120k soon.

Disclaimer: I’m looking forward to my 10th job in the industry, in the meantime, I’m just gathering materials for a future blog article: “A Retrospective on 10 Years of Salary Negotiations Gone Wrong: Still Counting“.


Chapter 15) Join The Recruitment Process

Become part of your recruitment process. That’s the best way to learn about recruiting, negotiations and one hundred other things.

For instance, after the 3rd dude in a row who failed to write a program to print the numbers from 1 to 100, you’ll feel less like a fraud. And there’s no such eye-opener as making an offer to a senior engineer who already has a job, only to have him reject you like a complete troll, stand up and leave the room.

Also, if you’re an idealist like me, that’s the only way to improve it: from the inside. And if you’re chaotic neutral idealistic like me, it can be a lot of fun, you get to see plenty of interesting people, while helping them, yourself and your company at the same time. (Chaotic Neutral Fact: Companies are ephemeral but what you learn for yourself is forever).

You’ll get a much better understanding of many things you couldn’t possibly dream of, with plenty of incredible and disastrous stories to tell (and recycle for your future negotiations).


yeah that aint happening
When you’ve been recruiting for 6 months for a position you’re qualified for and that’s the only answer you got, from the only candidate (out of 6) that passed the bar.

Look for the obvious signs. When this sort of thing happens, it’s time to either get a raise or jump ship.


Conclusion

“If you’re negotiating an offer, the thing that’s critical isn’t to be some kind of super genius. It’s enough to be pretty good, know what the market is paying, and have multiple offers.” - Dan Luu

Always be on the look out for better opportunities. Negotiate. Get competing offers. Get counter offers. Jump ship.

And remember that nothing can go wrong, you’ve already got a job, you’ve got nothing to lose.

the first year I was in London my compensation took 105 percent which really pissed me off because it was just shy of 10 percent a month
Shoot for the moon. Even if you miss, you’ll land among the stars.

References:

Salary Negotiation: Make More Money, Be More Valued, patio11 blog.

Don’t Call Yourself A Programmer, And Other Career Advice, patio11 blog.

Big company vs. startup work and pay, Dan Luu blog.

Developer hiring and the market for lemons, Dan Luu blog.

Salary Negotiation and Job Hunting for Developers, Twilio blog.

H1B Salary Data, base salaries are public for all H1B visa, at all companies, in all USA cities.


Google Cloud is 50% cheaper than AWS


Let’s revisit Google and Amazon pricing since the AWS November 2016 Price Reduction.

We’ll analyse instance costs, for various workloads and usages. All prices are given in dollars per month (720 hours) for servers located in Europe (eu-west-1).
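As a minimal sketch of how monthly figures relate to hourly list prices, the snippet below applies the 720-hour convention stated above. The function name and the sample hourly prices are hypothetical, chosen only for illustration; they are not actual AWS or Google quotes.

```python
HOURS_PER_MONTH = 720  # the convention used in this article: 30 days x 24 hours


def monthly_cost(hourly_price: float, instances: int = 1) -> float:
    """Return the monthly cost in dollars for instances billed by the hour."""
    if hourly_price < 0 or instances < 0:
        raise ValueError("price and instance count must be non-negative")
    # Round to cents so that floating point noise does not leak into reports.
    return round(hourly_price * HOURS_PER_MONTH * instances, 2)


# Hypothetical hourly prices, for illustration only.
print(monthly_cost(0.05))       # one instance at $0.05 per hour
print(monthly_cost(0.10, 3))    # three instances at $0.10 per hour
```

Keeping the hour count in one named constant makes it easy to switch to another convention (for instance 730 hours) without touching the rest of a comparison.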

Shared CPU Instances

Shared CPU instances give only a bit of CPU. The physical processor is over allocated and shared with many other instances running on the same host. A shared CPU instance may burst to 100% CPU usage for short periods but it may also be starved of CPU and paused. Note that these instances are cheap but they are not reliable for non-negligible continuous workloads.

google cloud vs aws pricing shared CPU instances

The smallest instance on both clouds has 500MB of memory and a few percent of CPU. That’s the cheapest instance. It’s usable for testing and minimal needs (can’t do much with only 5% of CPU and 500MB).

The infamous t2.small and its rival the g1-small are usually the most common instance types in use. They come with 2GB of memory and a bit of CPU. They are cheap and good enough for many use cases (excluding production and time critical processing, which need dedicated CPU time).

The Cheapest Production Instances

Production instances are all the instances with dedicated CPU time (i.e. everything but the shared CPU instances).

Most services will just run on the cheapest production instance available. That instance is very important because it determines the entry price and the specifications for everything.

google cloud vs aws pricing cheapest production instances

The cheapest production instance on Google Cloud is the n1-standard-1 which gives 1 CPU and 4 GB of memory.

AWS is more complex. The m3.medium is 1 CPU and 4 GB of memory. The c4.large is 2 CPU and 4 GB of memory.

m3/c3 are the previous generation (pre-2015), using older hardware and an ancient virtualisation technology. c4/m4 are the current generation, which has enhanced networking and reserved bandwidth for EBS, among other system improvements.

Either way, the Google entry-level instance is significantly cheaper than both AWS entry-level instances. There will be a lot of these running, so expect massive cost savings by using Google Cloud.


I’m a believer that one should optimize for manageability and not raw costs. That means adopting c4/m4 as the standard for deployments (instead of c3/m3).

Given this decision, the smallest production instance on AWS is the c4.large (2 CPU, 4GB memory), a rather big instance when compared to the n1-standard-1 (1 CPU, 4GB memory). Why are we forced to pay for two CPUs as the minimal choice on AWS? That does set a high base price.

Not only is Google cheaper because it’s more competitive, it also offers more tailored options. The result is a massive 68% discount on the most commonly used production instance.

Personal Note: I would criticize the choice of AWS to discontinue the line of m4.medium instance type (1 CPU).


Instances by usage

A server has 3 dimensions of specifications: CPU performance, memory size and network speed.

Most applications only have a hard requirement in a single dimension. We’ll analyse the pricing separately for each usage pattern.

google cloud vs aws pricing instances by usage

Network Heavy

Typical Consumers: load balancers, file transfers, uploads/downloads, backups and generally speaking everything that uses the network.

What should we order to have 1Gbps and how much will it be?

  • The minimum on Google Cloud is the n1-highcpu-4 instance (4 CPU, 4 GB memory).
  • The minimum on AWS is the c4.4xlarge instance (16 CPU, 30 GB memory).

AWS bandwidth allowance is limited and correlated to the instance size. The big instances -with decent bandwidth- are incredibly expensive.

To give a point of comparison, the c4|m4|r3.large instances have a hard cap at 220 Mbits/s of network traffic (Note: It also applies internally within a VPC).

figure2_7001
Source: Network and cloud storage benchmark in 2015

All Google instances have significantly faster network than the equivalent [and even bigger] AWS instances, to the point where they’re not even playing in the same league.

Google has been designing networks and manufacturing their own equipment for decades. It’s fair to assume that AWS doesn’t have the technology to compete.

CPU

Typical Consumers: web servers, data analysis, simulations, processing and computations.

Google is cheaper per CPU.

Google CPU instances have half the memory of AWS CPU instances[1]. While that could have justified a 10% difference, it doesn’t justify double[2].

Note: The performance per CPU is equivalent on both clouds (though the CPU models and serial numbers may vary).

[1] A sane design decision. Most CPU bound workloads don’t need much memory. (Note: if they do, they can be run on “standard” instances).

[2] Pricing is mostly linked to CPU count. Additional memory is cheap.

Memory

Typical Consumers: database, caches and in-memory workloads.

Google is cheaper per GB of memory.

Google memory instances have 15% less memory than AWS memory instances. While that could have justified a few percent difference, it sure as hell doesn’t justify double[2].

[2] Pricing is mostly linked to CPU count. Additional memory is cheap.

Local SSD and Scaling Up

Some software can only scale up, typically SQL databases. A database holding tons of data will require fast local disks and truckloads of memory to operate non-sluggishly.

Scaling up is the most typical use case for beefy dedicated servers, but we’re not gonna rent a single server in another place just for one application. The cloud provider will have to accommodate that need.

Google allows attaching local 375GB SSDs to any instance type ($85 a month per disk).

Some AWS instances come with a small local SSD (16-160GB); you’re out of luck if you need more space than that. The only option to get big local SSDs is the special i2 instance family, whose specifications come in multiples of 800GB local SSD + 4 CPU + 15 GB RAM (for $655 a month).

The Google SSD model is superior. It’s significantly more modular and cheaper (and more performant but that’s a different topic).

aws-vs-gce-pricing-instances-with-local-ssd
The requirements to fulfil are in parentheses.

Disk Intensive Load: A job that requires high volume fast disks (i.e. local SSD) but not much memory.

AWS forces you to buy a big instance (i2.xlarge) to get enough SSD space whereas Google allows you to attach an SSD to a small instance (n1-highcpu-4). The lack of flexibility from AWS has a measurable impact: the AWS setup is 406% the cost of the Google setup to achieve the same need.

Database: A typical database. Fast storage and sizeable memory.

Bigger Database: Sometimes there is no choice but to scale up, to whatever resources are commanded by the application.

On AWS (i2.8xlarge): 32 cores, 244 GB memory, 2 x 800 GB local SSD in RAID1 (+ 6 SSD unused yet gotta pay for them).

On Google Cloud (n1-highmem-32): 32 cores, 208 GB memory, 4 x 375 GB local SSD in RAID10.

This last number is meant to show that the lack of flexibility of AWS can (and will) snowball quickly. Only a very particular instance can fulfil the requirements, it comes with many cores and 4800 GB of unnecessary local SSD. The AWS bill is $4k (273%) higher than the equivalent setup on Google Cloud.
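Percentages such as “406% the cost” or “68% discount” are easy to misread, so here is a small sketch of the three ways two bills can be compared. The function names and the sample amounts are hypothetical; only the arithmetic matters.

```python
def cost_ratio_pct(expensive: float, cheap: float) -> float:
    """The expensive bill expressed as a percentage of the cheap bill."""
    return round(100 * expensive / cheap, 1)


def pct_higher(expensive: float, cheap: float) -> float:
    """How much more the expensive bill is, relative to the cheap one."""
    return round(100 * (expensive - cheap) / cheap, 1)


def pct_discount(expensive: float, cheap: float) -> float:
    """How much is saved by switching, relative to the expensive bill."""
    return round(100 * (expensive - cheap) / expensive, 1)


# Hypothetical monthly bills in dollars, for illustration only.
aws_bill, google_bill = 406.0, 100.0
print(cost_ratio_pct(aws_bill, google_bill))  # 406.0 -> "406% the cost"
print(pct_higher(aws_bill, google_bill))      # 306.0 -> "306% higher"
print(pct_discount(aws_bill, google_bill))    # 75.4  -> "75.4% cheaper"
```

The same pair of bills gives three different percentages, which is why it helps to state which base each figure is computed against.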

Custom Instances

Google offers custom machine types. You can pick how much CPU and memory you want, you’ll get that exact instance with a tailored pricing.

It is quite flexible. For instance, we could recreate any instance from AWS on Google Cloud.

Of course, there are physical bounds inherent to hardware (e.g. you can’t have a single core with 100 GB of memory).

Reserved Instances

Reserved Instances are bullshit!

Reserving capacity is a dangerous and antiquated pricing model that belongs to the era of the datacenter.

The numbers given in this article do not account for any AWS reservation. However, they all account for Google sustained use discount (30% automatic discount on instances that ran for the entire month).
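To see where the 30% figure comes from, here is a minimal sketch of a tiered sustained use schedule. It assumes the schedule Google documented at the time (each successive quarter of the month billed at 100%, 80%, 60% and 40% of the base rate); the tier table and function name are assumptions for illustration, not taken from this article, so check the current pricing page before relying on them.

```python
# Assumed schedule: (share of the month, fraction of the base rate billed).
TIERS = [(0.25, 1.0), (0.25, 0.8), (0.25, 0.6), (0.25, 0.4)]


def billed_fraction(fraction_of_month: float) -> float:
    """Return the share of a full month's base price that is actually billed
    for an instance that ran for `fraction_of_month` of the month."""
    if not 0.0 <= fraction_of_month <= 1.0:
        raise ValueError("fraction_of_month must be between 0 and 1")
    remaining = fraction_of_month
    billed = 0.0
    for width, rate in TIERS:
        used = min(remaining, width)  # usage that falls inside this tier
        billed += used * rate
        remaining -= used
    return billed


full = billed_fraction(1.0)
half = billed_fraction(0.5)
print(f"full month: {full:.2f} of base price ({1 - full:.0%} discount)")
print(f"half month: {half:.2f} of base price ({1 - half / 0.5:.0%} discount)")
```

Under that assumed schedule an instance running all month pays 70% of the base price, and the discount shrinks as usage drops, with no commitment required upfront.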

If your infrastructure is so small that you can reserve all your 4 instances upfront, you should reconsider why you use AWS in the first place. There are more appropriate and cheaper options available.

If your infrastructure is big enough that you have dozens of servers (or thousands), you should already be aware that:

  1. Long term commitment is a huge risk. Most people underestimate it.
  2. Predictions are always off. Most people are overconfident in their predictions.
  3. You are no exception to most people.
  4. Reservation is a mess when having many AWS accounts (dev, staging, prod).
  5. Anything that is testing/transient is too short-lived to be reserved.
  6. Less than 50% of reservable stuff can actually be reserved (margin for change/error).

Most managers are stubborn. If your manager is stubborn and really insists on reserving instances, you should bet exclusively on “1 year full upfront”.
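One way to put a number on the risk is to compute how much of the year an instance must actually run before a full-upfront reservation pays for itself. The sketch below is a simplification under stated assumptions: a single upfront payment, a flat on-demand hourly price, and hypothetical amounts that are not real AWS prices.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def break_even_utilization(upfront_cost: float, on_demand_hourly: float) -> float:
    """Fraction of the year an instance must run for a 1-year full-upfront
    reservation to cost the same as paying on demand."""
    if upfront_cost < 0 or on_demand_hourly <= 0:
        raise ValueError("upfront_cost must be >= 0 and on_demand_hourly > 0")
    return upfront_cost / (on_demand_hourly * HOURS_PER_YEAR)


# Hypothetical numbers: $600 upfront versus $0.10 per hour on demand.
utilization = break_even_utilization(600.0, 0.10)
print(f"break-even at {utilization:.0%} of the year")  # break-even at 68% of the year
```

If the instance is resized, replaced or shut down before that point, the reservation has cost more than on-demand would have.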

fishing with gr
Safety Warning: There is no confirmation button when you purchase reserved instances. You can absolutely spend $73185 without seeing or confirming an invoice.

Conclusion

google cloud vs aws pricing summary relative costs

AWS was the first generation of cloud, Google is the second. The second generation is always better because it can learn from the mistakes of the first and it doesn’t have the old legacy to support.

2016 should be remembered as the year Google became a better choice than AWS. If 50% cheaper is not a solid argument, I don’t know what is.


References:

Cloud Storage Performance, a benchmark with graphs on network performance.

Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network, A Google Research Paper, the story on what powers their internal network.

Amazon does everything wrong, and Google does everything right, A message by an employee who worked at Amazon and then at Google, not directly relevant but still a good read.

Before And After Docker: How To Deploy An Application


Docker is a packaging and deployment system. It allows you to package an application as a “docker image”, then deploy it easily on some servers with a single “docker run <image>” command.

Packaging an application

Packaging an application without Docker

built pipeline without docker
The Standard Build Pipeline
  1. A developer pushes a change
  2. The CI sees that new code is available. It rebuilds the project, runs the tests and generates a package
  3. The CI saves all files in “dist/*” as build artifacts

The application is available for download from “ci.internal.mycompany.com/<project>/<build-id>/dist/installer.zip”.

Packaging an application with Docker

build pipeline with docker
The Build Pipeline with Docker
  1. A developer pushes a change
  2. The CI sees that new code is available. It rebuilds the project, runs the tests and generates a docker image
  3. The docker image is saved to the docker registry

The application is available for download as a docker image named “auth:latest” from the registry “docker-registry.internal.mycompany.com”.

You need a CI pipeline

A CI pipeline requires a source code repository (GitLab, GitHub, VisualSVN Server) and a continuous integration system (Jenkins, GitLab CI, TeamCity). Docker also needs a docker registry.

A functional CI pipeline is a must-have for any software development project. It will ensure that your applications are automatically re-built, re-tested and re-packaged on every change.

The developers gotta write scripts to build their application, to run tests and to generate packages. Only the developers of an application can do that because they are the only ones to have the knowledge about how things are supposed/expected to work.

Generally speaking, the CI jobs should mostly consist of calling external scripts, like “./build.sh && ./tests.sh”. The scripts themselves must be part of the source code, so they evolve with the application.
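As a rough sketch of what such a thin CI job does, the helper below runs a list of commands in order and stops at the first failure, which is the same behaviour as chaining them with “&&”. The helper name is hypothetical and the script names are just the ones quoted above; a real CI system provides this for you.

```python
import subprocess
import sys
from typing import List, Sequence


def run_steps(steps: Sequence[List[str]]) -> int:
    """Run each command in order and return the first non-zero exit code,
    or 0 when every step succeeded. Later steps are skipped after a failure."""
    for step in steps:
        print("--> " + " ".join(step), flush=True)
        result = subprocess.run(step)
        if result.returncode != 0:
            return result.returncode
    return 0


# Harmless demonstration using the current Python interpreter as the "script".
ok = [sys.executable, "-c", "pass"]
exit_code = run_steps([ok, ok])
print("exit code:", exit_code)

# In a CI job, the call would look like:
#   sys.exit(run_steps([["./build.sh"], ["./tests.sh"]]))
```

Because the job only calls scripts that live in the repository, developers can run exactly the same steps on their own machine.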

You need to know your applications

Please answer the following questions:

  • What does the application need to be built?
  • What’s the command/script to build it?
  • What does the application need to run?
  • What configuration file is needed and where to put it?
  • What’s the command to start/stop the application?

You need to be able to answer all these questions, for all the applications you’re writing and managing.

If you don’t know the answers, you have a problem and Docker is NOT the solution. You gotta figure out how things work and write documentation! (Better hope the guys who were in charge are still working here and gave some thought to all that).

If you know the answers, then you’re good. You know what has to be done. Whether it will be executed by bash, ansible, Dockerfile, spec or zip is just an implementation detail.

Deploying an application

Deploying an application without Docker

  1. Download the application
  2. Set up dependencies, services and configuration files
  3. Start the application
# ansible pseudo code
- hosts: hosts_auth
  serial: 1  # rolling deploy, one server at a time
  become: yes

  tasks:
    - name: instance is removed from the load balancer
      elb_instance:
        ec2_elbs: [auth]
        instance_id: "{{ ansible_ec2_instance_id }}"
        state: absent

    - name: service is stopped
      service:
        name: auth
        state: stopped

    - name: existing application is deleted
      file:
        path: /var/lib/auth/
        state: absent

    # the directory was removed above, recreate it empty before unpacking
    - name: application directory exists
      file:
        path: /var/lib/auth/
        state: directory

    - name: application is deployed
      unarchive:
        src: https://ci.internal.mycompany.com/auth/last/artifacts/installer.zip
        dest: /var/lib/auth
        remote_src: yes

    - name: virtualenv is setup
      pip:
        requirements: /var/lib/auth/requirements.txt
        virtualenv: /var/lib/auth/.venv

    - name: application configuration is updated
      template:
        src: auth.conf
        dest: /etc/mycompany/auth/auth.conf

    - name: service configuration is updated
      template:
        src: auth.service
        dest: /etc/init.d/mycompany-auth

    - name: service is started
      service:
        name: auth
        state: started

    - name: instance is added to the load balancer
      elb_instance:
        ec2_elbs: [auth]
        instance_id: "{{ ansible_ec2_instance_id }}"
        state: present

Deploying an application with Docker

  1. Create a configuration file
  2. Start the docker image with the configuration file
# ansible pseudo code
- hosts: hosts_auth
  serial: 1  # rolling deploy, one server at a time
  become: yes

  tasks:
    - name: instance is removed from the load balancer
      elb_instance:
        ec2_elbs: [auth]
        instance_id: "{{ ansible_ec2_instance_id }}"
        state: absent

    - name: container is stopped
      docker_container:
        name: auth
        state: stopped

    - name: configuration is updated
      template:
        src: auth.conf
        dest: /etc/mycompany/auth/auth.conf

    - name: container is started
      docker_container:
        name: auth
        image: docker-registry.internal.mycompany.com/auth:latest
        state: started
        volumes:
          - /etc/mycompany/auth/auth.conf:/etc/mycompany/auth/auth.conf
        published_ports:
          - "8101:8101"

    - name: instance is added to the load balancer
      elb_instance:
        ec2_elbs: [auth]
        instance_id: "{{ ansible_ec2_instance_id }}"
        state: present

Notable differences

With docker, the python setup/virtualenv and the service configuration are done during the image creation rather than during the deployment. (The commands are the same, they’re just done in an earlier build stage).
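
As a rough sketch of what that earlier build stage could look like, still in Ansible: the play below builds the image on a CI host and pushes it to the registry used in the playbook above. The `ci` host group and the `/srv/build/auth` checkout path are hypothetical names for illustration, and the sketch assumes the `community.docker` collection is installed and that the checkout contains a Dockerfile running the pip/virtualenv setup.

```yaml
# Sketch only: build the auth image on a CI host and push it to the registry.
# "ci" and /srv/build/auth are hypothetical; adapt them to your own setup.
- hosts: ci
  tasks:
    - name: image is built and pushed
      community.docker.docker_image:
        name: docker-registry.internal.mycompany.com/auth
        tag: latest
        source: build
        build:
          path: /srv/build/auth  # directory holding the Dockerfile
          pull: yes              # refresh the base image before building
        push: yes
        force_source: yes        # rebuild even if the tag already exists locally
```

The deployment playbook then only has to pull and start the result; nothing is compiled or installed on the production hosts.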

The configuration files are deployed on the host and mounted inside Docker. It would be possible to bake the configuration file into the image but some configurations might only be determined at deployment time and we’d rather not store secrets in the image.
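
A minimal sketch of that deployment-time rendering, assuming the secret lives in a variable kept outside the image (for instance in an Ansible Vault file). The variable name `auth_db_password` and the file permissions are assumptions for illustration, not part of the original setup.

```yaml
# Sketch only: render the configuration on the host at deployment time.
# auth.conf is a template that would reference the hypothetical variable,
# e.g. a line such as:  db_password = {{ auth_db_password }}
- name: configuration is updated
  template:
    src: auth.conf
    dest: /etc/mycompany/auth/auth.conf
    owner: root
    group: root
    mode: "0600"  # root only; relax this if the container runs as another user
  no_log: true    # keep rendered secrets out of the Ansible output
```

The image stays generic and free of secrets, and the same image can be started with a different configuration in each environment.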

Infrastructure

Docker is only a packaging and deployment tool.

Docker doesn’t handle auto scaling. It doesn’t have service discovery, it doesn’t reconfigure load balancers, and it doesn’t move containers when servers fail.

Orchestration systems (notably Kubernetes) are supposed to help with that. Currently, they are quite experimental and very difficult to set up [beyond a proof of concept]. The lack of proper orchestration will limit Docker to being only a hyped packaging & deployment tool for the foreseeable future.

Docker [even with Kubernetes] needs an existing environment to run, including servers and networks. It ain’t gonna install and configure itself either.

All of that has to be done manually. Order servers in the cloud. Create OS images with Packer. Configure the VPC and networking with Terraform. Set up the servers and systems with Ansible. Install and deploy the applications (including docker images) with Ansible.
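
For the "set up the servers" step, installing Docker itself might look like the sketch below. It assumes Debian or Ubuntu hosts, where the distribution package is named `docker.io`; other distributions use different package names and package modules.

```yaml
# Sketch only: install and start Docker on Debian/Ubuntu hosts.
- hosts: hosts_auth
  become: yes

  tasks:
    - name: docker is installed
      apt:
        name: docker.io
        state: present
        update_cache: yes

    - name: docker is running and starts at boot
      service:
        name: docker
        state: started
        enabled: yes
```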

Cheat Sheet

  1. Figure out what is required and how to build the applications
  2. Write build, test and packaging scripts
  3. Document that in the README
  4. Set up a CI system
  5. Configure automatic builds after every change
  6. Figure out the application dependencies and how to run it
  7. Add that to the README
  8. Write deploy and setup scripts (with Ansible or Salt)

Conclusion

Packaging and deploying applications is a real and challenging job. A Debian package has some good practices and standards to follow whereas Docker comes with no good practices and no rules whatsoever. Docker is a [marketing] success in part because it gives the illusion that the task is easy, with a sense of coolness.

In practice though, it is hard and there is no way around it. You’ll have to figure out your needs and decide on a practical way to package and deploy your applications that will be tailored just for you. Docker is not the solution to the problem, it’s just a random tool among many others, that may or may not help you.

It’s fair to say that the docker ecosystem is infinitely complex and has a steep learning curve. If you have neat applications with clear and limited dependencies, they should already be relatively manageable and docker can’t make them any easier to deploy. On the contrary, it has the potential to make things harder.

Docker shines at packaging applications with complex, messy dependencies (typical NodeJS and Ruby environments). The dependency hell is taken away from the host and moved into the image and the image creation scripts.

Docker is handy for dev and test environments. It makes it easy to run multiple applications on the same host, isolated from each other. Better yet, it can run applications with conflicting dependencies that would otherwise be impossible to put on a single host.
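
As an illustration, a dev host running two applications side by side could be described as follows. The `dev` host group, the `billing` application, its image name and its port are all hypothetical; only the `auth` image and port come from the playbook above. The sketch assumes the `community.docker` collection is installed.

```yaml
# Sketch only: two isolated applications on one dev host.
# Each container carries its own dependencies, so they cannot conflict.
- hosts: dev
  become: yes

  tasks:
    - name: application containers are started
      community.docker.docker_container:
        name: "{{ item.name }}"
        image: "docker-registry.internal.mycompany.com/{{ item.name }}:latest"
        state: started
        published_ports:
          - "{{ item.port }}:{{ item.port }}"
      loop:
        - { name: auth, port: 8101 }
        - { name: billing, port: 8102 }  # hypothetical second application
```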

You should investigate a configuration management system (Ansible) if you don’t already have one. It will help you to manage, configure and set up [numerous] remote servers, à la SSH on steroids. It’s way more general and practical than Docker (and you’re gonna need it to install docker and deploy images anyway).

Reminder: In spite of the practical use cases, docker should be considered as a beta tool not quite ready for serious production.