Human Language Technology

So it begins!

Intro to Language Technology substack

hal
May 7, 2022

Intro

Welcome, anon!

Language technology (also referred to as natural language processing [NLP]) is going through a major transformation.

Just about a decade ago, NLP was a niche topic studied by academics, with modest success in a handful of practical applications. The fate of the field has changed thanks to a combination of advances in deep learning, compute, and datasets. Nowadays, the modern language technology stack is increasing international trade thanks to neural machine translation, lets anyone quickly prototype a unique language application with no-code tools like GPT-3, allows you to parse receipts with powerful off-the-shelf named entity recognizers, and so on. The field is progressing at such a pace that long skeptic pieces become outdated within a few weeks.

Most importantly, these advances are no longer prototypes sitting in academic labs; they are making a tangible impact on the world. VCs are investing 8-9 figures into large language model startups, big tech continues to fight for the best ML/AI talent, and consumers continue to reap the rewards of this technology.

But how can you benefit from it?

As exciting as it sounds, many of these advances are not easily accessible to builders (let’s assume these are technically inclined people). They lack access to large compute, detailed expert advice, and datasets for building their language technology applications, among other obstacles. Experts know that a lot of sweat and work goes into developing language technology applications.

Don’t worry, anon, I am here to help you. I have 5+ years of experience in this field, have been behind many of these advances, and have developed language technology applications both within large tech companies and on the side. I will help you put this fantastic language technology to use with little sweat and little money, and I will make it fun!

Excited?

Let’s dive into a high-level overview of future posts in this substack.


What is this substack about

Building language technology applications

We will focus on describing the process behind building a variety of language technology applications, including but not limited to:

  • Building datasets

  • Training our own models (or using best off-the-shelf solutions)

  • Evaluating the models

  • Deployment

  • Continuous life-cycle after deploying a model

We will cover a variety of language applications, from text classification tasks like sentiment analysis, named entity recognition, and entity linking to generative tasks like text generation, summarization, and so on.
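To make “text classification” concrete, here is a toy sentiment classifier in plain Python. This is purely an illustrative sketch: the word lists and scoring rule are made up for this example, and a real application would use a trained model rather than a hand-written lexicon.

```python
import re
from collections import Counter

# Tiny hand-written sentiment lexicon -- purely illustrative;
# a real classifier learns these associations from labeled data.
POSITIVE = {"great", "love", "fantastic", "good", "fun"}
NEGATIVE = {"bad", "terrible", "boring", "hate", "awful"}

def classify_sentiment(text: str) -> str:
    """Label text by counting positive vs. negative lexicon hits."""
    words = Counter(re.findall(r"[a-z']+", text.lower()))
    pos = sum(words[w] for w in POSITIVE)
    neg = sum(words[w] for w in NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

print(classify_sentiment("I love this fantastic substack"))  # positive
print(classify_sentiment("what a boring, terrible read"))    # negative
```

The tasks we will cover follow this same text-in, label-out (or text-out) shape, just with far more capable models behind them.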

Long-term, I hope that this substack will be your go-to guide for building a language technology MVP.

Modern language technology developer stack

We will continuously cover the latest tools that increase the productivity of language technology builders. Some of these tools include:

  • Best deep learning and natural language processing libraries

  • Data annotation tools

  • Almost no-code deployment solutions

I will help you focus on the right tools that will get you to a language technology MVP as quickly and as cheaply as possible without sacrificing performance.

Highlights of recent developments in the field

The number of ML/AI papers has been growing exponentially. The figure below does not even count the blogs and open-source projects released over the same period.

[Figure: Mario Krenn’s chart of monthly new ML+AI papers on arXiv, which appears to grow exponentially with a doubling rate of 23 months]

Seven years ago I could keep up with almost every paper appearing on arXiv (an academic preprint service) without that much sweat. Nowadays, the information overload makes it impossible: on average, 50+ papers come out in the machine learning category on arXiv every day.
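To put that pace in perspective, here is a quick back-of-the-envelope projection. It assumes the ~50 papers/day figure and the 23-month doubling rate both hold going forward, which is of course speculative:

```python
# Project daily arXiv ML paper volume, assuming ~50 papers/day today
# and a 23-month doubling rate (both figures from the text above;
# extrapolating them forward is speculative).
papers_per_day = 50
doubling_months = 23

for years in (2, 5, 10):
    projected = papers_per_day * 2 ** (years * 12 / doubling_months)
    print(f"in {years:2d} years: ~{projected:.0f} papers/day")
```

Even if the growth slows, the trend explains why curating the important work matters more every year.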

It is great that there are tons of talented people pushing the boundaries of our field every day. However, almost all of these papers are incremental, rehash existing ideas in a new light, or simply make a contribution to a very narrow problem. Unlike how you see it in the movies, real progress in AI is many years of incremental work from different groups coming together at the right time and place.

Thankfully, after years of reading papers and following the space, you learn to quickly understand the ideas in papers and identify the important ones. I will share these insights with you as well.

What is this substack NOT about

Basics of natural language processing and machine learning

I am not planning to write articles about the fundamentals of deep learning, natural language processing, or machine learning, for a very simple reason: there are tons of great videos, books, and blogs out there covering the fundamentals of each discipline. I will link to further reading when mentioning unfamiliar concepts, but I won’t explain them from scratch.

Here are some great materials to read and watch by world-class experts in the field (I am sure I missed many more):

  • Deep Learning:

    • NYU course by Alfredo Canziani and Yann LeCun

    • Deep Learning Book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville

  • Natural Language Processing:

    • Introduction to Natural Language Processing by Jacob Eisenstein

    • Stanford course on Natural Language Processing

  • Machine Learning:

    • Probabilistic Machine Learning: An Introduction by Kevin Murphy

    • Stanford course on Machine Learning

Note: I don’t recommend consuming these in one go; instead, keep them as a reference and come back whenever you want to dig deeper into a certain topic.


Voila! I hope this gives you a general overview of future posts in this substack. I am sure it will bring tons of value to many people.

I would really appreciate it if you could recommend this substack to your friends who are interested in this topic!

See you next week!
