Zach Wilson

These are the best posts from Zach Wilson.

45 viral posts with 85,638 likes, 4,225 comments, and 1,391 shares.
6 image posts, 0 carousel posts, 2 video posts, 37 text posts.

Best Posts by Zach Wilson on LinkedIn

We need to normalize taking career breaks!

I took a year off in 2020 and it was one of the best decisions of my life!

Whether you’re:
- jumping into parenthood
- needing a #mentalhealth break
- focusing on your passion projects

Breaks should be embraced for the gifts they are!

Thank you LinkedIn for rolling out this feature to show the human side of a career!
[Post image]
Data analysts aren't “lower” than data scientists or data engineers. This perception is faulty for a few reasons.

- Data analysts are often the most business-savvy out of the bunch.
- Data analysts create tons of business value with the insights they deliver.
- Data analysts generally know how to code.
- Data analysts can grow into analytics managers and build teams just as well as data scientists and data engineers can.

#dataengineering
#datascience
#dataanalytics
In my 9-year career, I’ve never made it 2 years at one company.

People have asked me, “Has this job hopping negatively impacted your career?”

I missed out on opportunities at Databricks and Robinhood even though the hiring managers said I was technically able to do the job. They didn’t trust I’d stick around long enough to deliver value.

One of my goals when I work at companies is to come up with my impact story to illustrate to them that even if I only stay 2 years, it’ll be a good ROI for them.

An impact story looks like this; it should be something you can say in less than 30 seconds.

At Facebook, I created a new reachability metric that shed light on how people were reacting to spammy notifications, improving the product experience. I integrated growth and engagement metrics between Facebook, Messenger, Instagram, and WhatsApp for the first time. I optimized the notifications machine learning pipeline that processed over 250 TBs per day to use 75% less compute.

At Netflix, I created the asset inventory master data model that allowed security teams to see how everything in the infrastructure was connected and minimize risk. I built a metric that measured the cloud costs of AB tests that allowed for smarter AB test rollout decisions. I optimized Netflix’s network log pipeline that processed over 100 TBs per hour to use 20% less compute.

Those are the two stories I told Airbnb to get the staff role I have now! Remember to think about the impact you’re having at your current role and how to explain it eloquently in a short amount of time. That’s how you can sell yourself well and get paid what you deserve!

#dataengineering
#softwareengineering
A mistake many data engineers make is thinking real-time = streaming pipelines!

Whenever a stakeholder says they want data in real time, you shouldn't default to dreams of Flink, Kafka, and watermarks.

You should clarify with precision what an acceptable amount of latency is for this use case.

Many times when a stakeholder asks for real-time data, it can be solved with an hourly batch pipeline. The incremental benefit to jump from hourly batch to streaming isn't worth it because it impacts the homogeneity of your suite of pipelines and makes the overhead maintenance much higher!

Sometimes stakeholders say real-time and what they mean is “predictable refresh rates.” This is a sign you need to do better as a data engineer at setting SLAs for your pipelines about when they'll refresh.

#dataengineering
If you use Excel for data analytics, you’re a data analyst. You don’t have to know SQL and Python.

Don’t belittle others for using tools that are different from yours! It’s very impressive how far business can go with just Excel.

#dataanalytics
#datascience
#data
#excel
Today was the second day I’ve ever been to Airbnb’s office and it was to return my work laptop. I worked there over 500 working days and went into the office once.

Thank you Airbnb for being remote-first and a genuinely 21st century technology company!
You raised the bar for me on what to expect from companies in terms of lifestyle freedom.

Return to Office now equals “I’m not applying to that company!”

#airbnb
#remoteworking
#dataengineering
[Post image]
Cmon Netflix! Is the range really $150k to $900k? I thought y’all added levels!

Imagine being on a team of engineers. Y’all are all senior engineers, one of y’all is making $150k and the other is making $900k doing the same job.

That’s what this range implies right, Netflix? That people in this role on the same team will be making this range?

Why would you put the overall market range when people care about the range for this specific job?

California transparency laws aren’t having quite the effect we would hope yet!

#compensation
[Post image]
The average length of time I’ve worked for a company is 13 months over my career. Yet I’ve been able to get jobs at some really great companies and keep growing!

Job hopping stigma is a myth perpetuated by toxic employers to hold onto their employees. They need to treat you well or you should leave for greener pastures!

Keep your skills up to date, get good at communication and negotiation, and be brave! A wonderful opportunity is just around the corner!

#jobhopping
#growth
#career
I won’t be at the Databricks AI summit this year even though it’s across the street from my apartment.

Here’s why:

Last year, a Databricks cofounder asked me if he could use my content in his keynote at the summit.

I said yes. He used two different YouTube videos.

After the summit, this cofounder urged me to teach Databricks.

I decided to do so in January and have taught over 1,000 students Databricks since then.

Databricks also became the single biggest line item cost to my business.

Databricks has a startups program where they give people $50,000 in credits.

They rejected me from this program because “they do not see value in working with me.”

So, I won’t be attending this year because they aren’t very kind to bootstrapped startups.

They say I should pass that cloud cost onto the student.

Databricks stands alone here in burning students looking for affordable education.

The following companies do give me credits to allow for more affordable cloud education:

- Amazon Web Services (AWS)
- Astronomer
- Starburst
- Snowflake
- Confluent

Please support these companies because they care about students.

If you were looking to see me this year, sorry I will not be there.

Thanks for understanding! Hopefully I will see you guys next year!

Please repost to increase awareness!
Many people have asked me, “Why do you put ADHD in your headline? Isn't that just broadcasting your problems to the world?”

I put ADHD in my headline to show people that you can be successful even if you have a #mentalhealth disorder.

I put ADHD in my headline so people feel comfortable messaging me about how to manage their ADHD or their kid's ADHD. I get DMs every day from people asking me how to manage their ADHD better.

I put ADHD in my headline to normalize #mentalhealth disorders and to be an advocate for neurodiversity.

People with ADHD deserve respect and acceptance!

Remember this not only during May but the entire year, please!
I know data engineers who know just Python and SQL who make $500k at Netflix.
You don’t need to know the high performance languages to make a killing as a data engineer!

#dataengineering
I love Data Analytics!

As a Data Engineer for the past 9 years, I find myself loving the work of Data Analytics more and more. I think there is actually a lot of crossover with Data Engineering as well.

If you can learn how to create clear dashboards from SQL queries, it adds so much value. Most companies aren't dealing with petabytes of data, so keeping things simple and working with tools like Tableau, Excel, or Power BI is a fantastic skill to learn.

Then working on metrics to guide business decisions and measure impact with experiments is the next step.

I think the rise of Data Analysts + Data Engineers mixing to be Analytics Engineers will be a big thing in the future.

Inspired by Alex Freberg
Layoff compassion fatigue is setting in. You can see it in the engagement rates here on LinkedIn.

The Twitter layoff posts last week were averaging 5k reactions.

The Meta layoff posts today are averaging 500 reactions.

Remember that we’re at the beginning of this. Layoffs are terrible and disconcerting.

We need to keep leveraging LinkedIn for its original purpose of job finding!

Keep spreading love and joy out there during these dark times!
You shouldn't try to learn all of data engineering at once! You'll get overwhelmed and feel like you aren't making any progress!

A piecemeal approach to eating the data engineering elephant is better.

Start with:
- SQL
Get good with SELECT, GROUP BY, WHERE, HAVING, JOIN, etc. DataLemur 🐒 (Ace the SQL & Data Interview) is a great resource to get into this.

Then branch into:
- Python
Get good with loops, variables, classes, dictionaries, tuples, and arrays. LeetCode still seems like the best place to practice this.

Then branch into:
- job orchestration
Airflow is the most popular option here, but the startup costs are kind of high to get going. A newer option is Mage, which has a very easy startup and can orchestrate things as well as Airflow.

Then branch into:
- distributed compute (Snowflake, Spark, BigQuery, etc)
Almost all of these platforms have free trials. Some key things here to learn about are partitioning, memory, broadcast joins, and caching.

Then branch into:
- data modeling
Learn about fact tables, dimension tables, slowly changing dimensions, cumulative table design, and change data capture. Reading one of Bill Inmon's books about this will get you ahead here!
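
A tiny runnable sketch of that first SQL milestone, using Python's built-in sqlite3 (the table and values here are made up for illustration):

```python
import sqlite3

# Throwaway in-memory table to practice SELECT / WHERE / GROUP BY / HAVING.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (city TEXT, fare INTEGER)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("NYC", 10), ("NYC", 20), ("SF", 15), ("SF", 5), ("LA", 8)],
)

# Cities with more than one trip, along with their total fares.
rows = conn.execute("""
    SELECT city, COUNT(*) AS trips, SUM(fare) AS total_fare
    FROM trips
    WHERE fare > 0
    GROUP BY city
    HAVING COUNT(*) > 1
    ORDER BY city
""").fetchall()
print(rows)  # [('NYC', 2, 30), ('SF', 2, 20)]
```

Note how LA drops out: WHERE filters rows before grouping, while HAVING filters the groups themselves.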


#dataengineering
Hey everybody! I’m excited to announce I’m going to be opening my calendar up to anybody for a 30-minute #dataengineering career guidance Google Meet call from 8 to 9 AM Pacific starting tomorrow for the next two weeks.

The cost of the call is $50 and all proceeds over the next two weeks will be donated to Code to Inspire, a charity that teaches women and girls in Afghanistan how to code!
Code to Inspire’s website: https://lnkd.in/etg4dmBT

EDIT: We’re fully booked now! 22 sessions and raising $1100 for women and girls in Afghanistan.

Let’s help raise $1000 for those in need by getting the career help you need!
SQL interviews are the most common interview in data engineering. Here’s the 20% of SQL that gets you past 80% of the interviews.


- Window functions, especially RANK/DENSE_RANK/ROW_NUMBER
Window functions are critical to passing DE interviews nowadays. Understanding how PARTITION BY works to slice up windows and ORDER BY works to sort windows is important. Also understanding the ROWS clause for rolling SUM/AVG questions!

- COUNT(CASE WHEN)
Doing a CASE WHEN inside an aggregation prevents an additional table scan and will make your query much more performant than using the WHERE clause. This optimization shows up a lot in interviews.

- Knowing how to reduce queries down to ANSI-SQL
You should be able to use the lowest-common-denominator SQL to solve the problem. Anything fancy like GROUPING SETS should be avoided since it isn’t supported by every engine.

- Know how common table expressions work
Common table expressions help make your query way more organized and readable. Using these shows the interviewer you’ll write clean code on the job!
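
To make the COUNT(CASE WHEN) and CTE points concrete, here is a small sketch using Python's sqlite3 (table and column names are hypothetical):

```python
import sqlite3

# Hypothetical events table: count clicks and views per user in one scan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event_type TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "click"), (1, "view"), (2, "click"), (2, "click"), (3, "view")],
)

query = """
WITH per_user AS (                -- CTE keeps the query organized
    SELECT user_id,
           COUNT(CASE WHEN event_type = 'click' THEN 1 END) AS clicks,
           COUNT(CASE WHEN event_type = 'view'  THEN 1 END) AS views
    FROM events
    GROUP BY user_id
)
SELECT user_id, clicks, views FROM per_user ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 1, 1), (2, 2, 0), (3, 0, 1)]
```

COUNT only counts non-NULL values, so each CASE WHEN acts as a conditional counter without a second pass over the table.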

#dataengineering
Data engineering isn’t just Spark
Full stack engineering isn’t just NextJS
Data analytics isn’t just Tableau
Analytics engineering isn’t just dbt
Data science isn’t just XGBoost
Machine learning isn’t just fine tuning models
Prompt optimization isn’t just AdalFlow

Stop boiling entire fields down to one technology and recognize every field here has a ton of breadth!
I was on a first date with a woman today and one of you randomly came up to me and was like, “Your videos changed my life, dude, thank you.”

And the woman was like, “does that happen often?”

And I was like, “only every day of my life”

That moment makes the last 5 years of shouting endlessly into the void about data pipelines worth it!

Content creation is crazy!
Figuring out which data role you should become can be challenging!

One dimension to look at is how much time you spend digging into the data vs how much time you spend building infrastructure!

Obviously, other things like pay, responsibilities, and skill set are factors as well.

I’ve tried most of these roles throughout my career.

Where should ML engineer fall on this continuum?


#datascience
#dataengineering
[Post image]
If you’re less than 30 years old and scared you don’t have things “figured out”

I blew up my life at 29.

I stepped off the “career ladder”

I no longer idolized being a principal engineer!

I reinvented myself at 29 and my life got much better!

I have a theory that I’ll be reinventing myself every 7 years.

So even if you have it “figured out” now, you won’t in the future! And that should take some of the pressure off!

You don’t need to solve all of the world’s problems today.

Just make your bed, write some code and be grateful that you’re healthy!

#dataengineering
YouTube has paid me $411 so far for releasing five data engineering boot camp videos.

The equivalent of <25% of one student’s tuition.

This is why most educators don’t publish on YouTube. Educational content doesn’t garner enough eyeballs to be worth it.

Do I regret doing this? Not even a little bit.

This is why:

- The free content has supercharged my January boot camp sales. Instead of averaging 1-2 sales ($2,000-5,000) per day, I’m averaging 4-5 sales ($8,000-12,000) per day.

Releasing the free boot camp has generated $60k in paid boot camp sales in the same week.

So maybe more educators need to quit gatekeeping their knowledge and realize releasing quality content on YouTube is NOT ZERO SUM. You make the pie bigger and you help people out who cannot afford it!

#dataengineering
[Post image]
Every SQL concept you should know to ace #dataengineering interviews:

- Basics
SELECT, FROM, WHERE, GROUP BY, ORDER BY and HAVING

- Window functions
Know the difference between RANK vs DENSE_RANK vs ROW_NUMBER

Know how PARTITION BY and ORDER BY work in the OVER clause

Know about the QUALIFY clause if you really want to wow the interviewer

- JOINs
Know when self-joins work well
Know LEFT vs FULL OUTER vs INNER joins
Know how to handle skewed joins and the tradeoffs of the various approaches

- Advanced analytic functions
Know how to leverage GROUPING SETS, ROLLUP and CUBE

Know how to create your own UDFs to enhance your SQL

- Arrays
Know about CROSS JOIN UNNEST / LATERAL VIEW EXPLODE
Know how to TRANSFORM and REDUCE array values

What did I miss?
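
A quick way to see the RANK vs DENSE_RANK vs ROW_NUMBER difference is to run all three over the same tie. A sketch via Python's sqlite3 (which supports window functions from SQLite 3.25 onward; the players and scores are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 90), ("b", 90), ("c", 80)],  # a and b tie for first
)

rows = conn.execute("""
    SELECT player,
           RANK()       OVER (ORDER BY points DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY points DESC) AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY points DESC) AS row_num
    FROM scores
""").fetchall()
for row in rows:
    print(row)
# Ties share a RANK and leave a gap (1, 1, 3), DENSE_RANK has no gaps
# (1, 1, 2), and ROW_NUMBER is always unique (1, 2, 3).
```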
Databricks AI summit is amazing y’all!

Platforms are changing how data engineering is done in a way that should be welcomed and encouraged!

I’ve gone back-and-forth on this throughout my creator journey. The velocity increases they bring are worth the incremental cost!


#dataengineering
Fundamental concepts every data engineer should know because they don’t really change

- ANSI SQL
- distributed compute
- OLTP vs OLAP
- CAP theorem
- slowly-changing dimensional modeling
- fact data modeling
- logging best practices
- AVRO / Thrift schemas
- idempotent pipelines
- job orchestration
- flexible schema vs defined schema
- data quality testing

#dataengineering
Here are my rules for an interesting life:

- never renew a lease anywhere
- don’t buy a house
- always have roommates
- try anything that won’t kill you at least once
- eat food so spicy you cry
- never do the same thing longer than 2 years
- dance so hard people think you’ll kick them
- keep your body in tip top shape
- believe that you have a lot to offer
- dedicate 30 minutes a day to learning
- adopt a high-energy pet
- caffeinate yourself but not after 12 PM
- get lost in nature often
- work extremely hard on building your dreams

#mentalhealth
Being good at #dataengineering is WAY more than being a Spark or SQL wizard.

You need to be good at questioning things and pushing back on low-value requests. Communicating with your downstream users to find out their pains and forging a data model that they understand and is scalable.

You need to be able to work well as a team. Data scientists and software engineers are often in your value chain. Learning how to excite and motivate the people in your value chain will truly launch your career into the stratosphere.

You need to be good at stress management. Data engineering usually involves juggling many requests in parallel. Being able to breathe and focus on the highest value requests first is critical for long-term success as a data engineer.
Jan 6th, 2016 is a day that I will always remember. It was simultaneously the scariest and best day of my life. It marks 7 years of sobriety for me.

I was so sick and tired of being stuck in a rut with drugs. I would get sober for a month or two and then fall right back into it. It was a dance I did from 2009 to 2015. I was done! I wanted something better! I was done throwing away my potential!

I decided to move away. Far away. From Salt Lake City to Alexandria, Virginia.

No “friends” pulling me down. Just sitting with my thoughts so I could focus on my success. Changing my environment changed my chances of success.

7 months after getting sober was when I landed my big break doing #dataengineering for Facebook.

After a year of being sober, so many aspects of my #mentalhealth improved.

I used to have panic attacks almost every day. I now rarely have them.

I used to not sleep much at all. Now I sleep soundly.

I used to not be very confident that I was smart and could do great things. Now I’m fearless with my capabilities and I know I’m going to do great things.

Getting sober was the number one best decision I had to make to build my success!
Ad-hoc SQL queries and SQL queries running in production will generally look pretty different. Copying and pasting the data scientist’s query into Airflow isn’t quite enough to be considered “putting it into production.”

Here are some high-level things to look for in ad-hoc queries that should be changed before moving to production:

1. GROUP BY 1,2,3… / ORDER BY 1,2,3
This is used to speed up writing ad-hoc queries. Please spell them out in production.

2. SELECT *
This grabs all the columns quickly for ad-hoc queries. Please spell out the columns in production.

3. LIMIT 1000
Limits should generally be removed when moving to production since you want the entire data set.

4. Subqueries
Subqueries should almost always be abstracted as CTEs when running in production.

5. WHERE date >= startDate
This should be switched to WHERE date BETWEEN startDate AND endDate. Otherwise your pipeline won’t be idempotent and will produce different results depending on when it’s run.
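
A minimal sketch of point 5, using Python's sqlite3 with made-up table names and dates: the bounded filter returns the same rows no matter when the backfill runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (event_date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [("2023-01-01", 1), ("2023-01-02", 2), ("2023-01-03", 3)],
)

# Non-idempotent: the result set grows as new partitions land.
open_ended = conn.execute(
    "SELECT * FROM logs WHERE event_date >= '2023-01-01'"
).fetchall()

# Idempotent: both ends bounded, so a rerun always returns the same window.
bounded = conn.execute(
    "SELECT * FROM logs WHERE event_date BETWEEN '2023-01-01' AND '2023-01-02'"
).fetchall()

print(bounded)  # [('2023-01-01', 1), ('2023-01-02', 2)]
```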

#dataengineering
Remember to take regular breaks.

Try your best to not work for longer than 90 consecutive minutes.

Try your best to not work for longer than 3 consecutive months without a longer break.

If you take both daily short breaks and monthly long breaks, your productivity and #mentalhealth will soar!
For the sake of your coworkers' #mentalhealth, don't merge your code at 4 PM today!
Deploy on Monday when you'll have more time to babysit it!

#softwareengineering
You can’t just hire a data scientist or a data engineer and expect to unlock all the value out of your data.

You need to treat your data as an investment. You invest in infrastructure and it pays you back over time.

This infrastructure has many pieces:

- logging
- pipelines
- analytics
- models
- experimentation
- decision making processes

You need to hire people to handle different pieces of this stack.

If the data isn’t actually incorporated into decision making processes, then you’re still missing a huge part of the infrastructure even if you have a rockstar data team.

#dataengineering
#datascience
Many data engineers get filtered during the data modeling round of the #dataengineering interview.

Some key things to know:

- when to use normalized vs denormalized data
- diagramming skills to sketch out the one->many, many->many relationships
- be able to talk soundly about dimension, fact, and aggregate tables.
- be able to talk about efficient table designs like cumulative, slowly-changing dimension, and delta tables

Some key things to do in the interview:

- ask about schema
- clarify your relationship assumptions
- ask about business use case and query patterns

Candidates who do these things tend to get hired!
As you grow in your career, automating more and more of your life becomes necessary.

For example, I pay people to clean my house, schedule my appointments, grocery shop, and do my yard work.

I do this because the penalty to my productivity is too high if I do these things myself. Not having food in the fridge because I got caught up coding something very important happened too many times.

The more things I move off my plate, the more I have a fresh slate to be creative and productive in the way that society really needs me to be.

#career
#work
Databricks featured my YouTube video from 2021 in a keynote today!

I didn’t expect my pandemic hair to be broadcast to so many people live at the same time 😆!


#dataengineering
[Post image]
Facebook becoming Meta really makes the FAANG acronym obsolete. RIP FAANG
Emerging job titles in data to be aware of:

- Product Analyst
This person is like a data scientist in that they know experimentation, metrics, and storytelling very well. They don’t need machine learning skills quite as deep as a data scientist’s, though!

- Analytics Engineer
This person is like a data engineer but they usually work more in the analytics and experimentation layer instead of the raw data layer. Think of them as a blend of a data analyst and a data engineer. At Netflix these people were called spanners because they could write pipelines, metrics, visualization, and experiments.

- Software Engineer, Data
This person is like a data engineer but also owns pieces of data infrastructure. If you’re a data engineer that runs streaming jobs or owns online services, you’re probably closer to this job title than data engineer!

#dataengineering
#softwareengineering
Jan 6th, 2016 - the day I decided to get sober.

I packed up my bags and drove from Salt Lake City to Washington DC.

I was scared. I didn’t know what this new life was going to have in store for me.

I was just so sick and tired of being sick and tired!

So I left Utah despite all my friends wanting me to stay

That day will always be etched into my soul because I overcame the inertia of my home town and spread my wings and flew.

7 months after getting sober, I landed a job at Facebook and I knew my life was set. I could finally give up that survival mindset and start focusing on thriving!

Please remember if you’re suffering, you’re one good decision away from a brand new life!
Data engineers come in a few levels:

- level 1
Knows Python and SQL. Can move data from point A to point B so long as it’s not too big

- level 2
Knows distributed compute basics like BigQuery and Spark. Can move data around on the order of single terabytes

- level 3
Masters distributed compute and can build pipelines of arbitrary size

- level 4
Actually talks with stakeholders before building pipelines
Data engineering is really fun like playing with LEGOs. ETL actually stands for extra tall LEGOs. #dataengineering
The best part of being a solopreneur is never having to use JIRA ever again!

#softwareengineering
#mentalhealth
You shouldn’t always use Spark for your pipelines!

Spark really should be reserved for large-scale pipelines because its distributed nature is wasteful in smaller-scale environments.

In those smaller-scale environments, things like Presto/Trino or even Pandas/Polars will do much better.

There are exceptions here though! At Netflix I wrote a few Spark jobs that processed 1000 rows and it felt dirty.

I did this because these jobs were part of a suite of pipelines that all integrated together. It was worth it to be a little wasteful in terms of compute to enhance homogeneity and readability of the pipelines!

#dataengineering
When I worked at Airbnb, something data engineering hiring managers said was “We don't have much success hiring Meta data engineers.”

The reason for this is Meta's definition of the data engineer role and Airbnb's definition were quite different!

Airbnb wanted to hire a different data engineering archetype.

Airbnb looked for strong data structures and algorithm skills, strong SQL, software engineering fundamentals, and Scala/Java programming experience.

Meta looked for strong analytical skills, strong SQL, visualization skills, some data structures and algorithms, and decent Python skills. This skill set aligned more closely with the analytics engineer role at Airbnb!

This can make hiring and interviewing for data engineering roles very frustrating!

Before you do a full onsite loop with a big tech company, make sure you know what type of data engineering role you're interviewing for!

#dataengineering
I’m officially moving out of SF on Jan 20th. Many people are wondering why?

Here are the ranked reasons why I’m leaving this time:

- nervous system reset
SF isn’t for people looking to calm down. It’s for people looking to change the world. I need to reset my nervous system after 3 long years building DataExpert.io. I’m trying to get ahead of the burnout instead of lying to myself that it’s not happening.

- establish residency in another state
Visiting California is great. Being a California resident gets you so little for paying so much. California taxes income over $600k at 13%. By choosing to be a California resident, I set an entire six weeks of work on fire every single year. I run one entire boot camp just for California. This pain will only get louder and more annoying.

- Y Combinator, HF0 Residency, Slow Ventures, and a16z speedrun made it clear that I AM NOT A FIT FOR THE SF STARTUP ECOSYSTEM. I would be 100% down to pay the 13% tax if I was getting something extraordinary from SF. I have no employees in SF. I have no investors in SF. I mostly have a ton of friends. Paying $200,000 extra per year for friendship isn’t worth it.

- reinvesting that $200,000 per year into the stock market means I can FatFIRE at 34 instead of 37. California weather isn’t worth 3 extra years of my life grinding my face off.

- a majority of San Francisco women I come across don’t want to have children. Staying here and dating women who don’t share a long term vision with me isn’t a good plan.

- Moving to a conservative state would do two things. 1. I become a big fish in a small pond. 2. I increase the odds of finding a woman who wants the same thing as me.

- Staying in SF sounds like a waste of time since it feels like it’s increasingly conflicting with my life goals. It’s time to grow up and move somewhere that’s fitting of these goals.
My path to how I became a staff engineer at Airbnb in 6 years:
1. I graduated from Weber State University in 2014 with a dual-major bachelor’s in math and CS
2. I worked at 3 different startups in Utah and Virginia in 3 years before breaking into FAANG (Red Brain Labs, Think Big Analytics, and Research Innovations Inc)
3. I focused deeply on data eng, full stack eng and mobile eng for 4ish years.
4. I built an Android app for building Magic: the Gathering decks that got 150k installs.
5. I focused on building products and apps with users and scalability.
6. I broke into Facebook 3 years into my career as a level 3 (junior) data engineer.
7. Made it to level 4 (mid) in 9 months.
8. Wanted to be a software engineer so bad.
9. Jumped to Netflix as a senior data engineer. Switched to software after a year as a data engineer.
10. In 2020, I felt tired after working so hard for so long.
11. Quit Netflix in March 2020 and was unemployed for the rest of 2020.
12. Started applying for staff roles in December 2020. Got a lot of “you need more experience” from many companies.
13. Crushed my interview with Airbnb.
Data modeling is far from dead! It’s actually more relevant than ever.

There’s been an interesting shift in the seas with AI. Some people are saying we don’t need to do facts and dimensions anymore. This is a wild take because product analytics don’t suddenly disappear because LLMs have arrived.

It seems to me that multi-modal LLMs are bringing together the three types of data:

- structured

- semi-structured

- unstructured

Dimensional modeling is still very relevant but will need to be augmented to include semi-structured outputs from the parsing of text and image data.

The necessity for complex types like VARIANT and STRUCT seems to be rising, which is increasing the need for data modeling, not decreasing it.

It feels like some company leaders now believe you can just point an LLM at a Kafka queue and have a perfect data warehouse, which is still SO far from the actual reality of where data engineering sits today.

Am I missing something or is the hype train just really loud right now?
