HackerNoon Contributor Davis David Builds Datasets for African Languages

Written by davisdavid | Published 2021/10/04
Tech Story Tags: meet-the-writer | hackernoon-contributors | interview | machine-learning | data-science | blogging-fellowship | python | hackernoon

TLDR Davis David is a Data Scientist, Software Developer and Content Creator from Tanzania. His latest HackerNoon top story covers the development of a Swahili news dataset for topic classification tasks in machine learning. His biggest writing challenge is explaining complex technical concepts in simple language that everyone can understand. His next big goal is to start creating e-books on Machine Learning and Data Science. Outside of tech, he loves walking in the evening while listening to music.

This story is a part of Hacker Noon's Meet the Writer series of interviews. The series is intended for tech professionals contributing the most insightful Hacker Noon stories to share more about their writing habits, ideas, and professional background (and maybe a hobby or two).

If you too would like to start contributing to Hacker Noon, you can do so here.

So let’s start! Tell us a bit about yourself. For example, name, profession, and personal interests.

My name is Davis David, and I'm a Data Scientist, Software Developer and Content Creator from Tanzania. I love coding, writing and engaging with tech communities around the world.

Interesting! What was your latest Hackernoon Top story about?

My latest HackerNoon top story is about the development of the Swahili news dataset for topic classification tasks in machine learning. Swahili (also known as Kiswahili) is one of the most widely spoken languages in Africa, and we all know that news in local languages plays an important cultural role in many African countries.

So the main purpose of this article was to share some important steps I took while creating the dataset, and I hope it will help anyone who is interested in developing NLP datasets for different machine learning tasks.

Do you usually write on similar topics? If not, what do you usually write about?

Yes, I write on the same topics most of the time. I tend to focus on machine learning, data science, open-source tools, dataset creation, and Python in general. Sometimes I write about different machine learning hackathons I have hosted or tech conferences I have attended.

Great! What is your usual writing routine like (if you have one)?

My writing routine is very simple. I start by finding good article ideas to write about. I can find ideas from other articles I'm reading, online courses, video tutorials and dev news (Twitter and newsletters). Then I collect the ideas I want to write about and follow these steps:

  • Find a good title for my article with a good keyword
  • Find a good image that will represent the idea of my article
  • Prepare the source code that will be part of the article
  • Write the main body (including the attached source code snippets)
  • Write the introduction and conclusion
  • Proofread and fix any errors (I use the Grammarly tool)
  • Publish the article

These steps can take up almost 5 days of the week.

Being a writer in tech can be a challenge. It’s not often our main role, but an addition to another one. What is the biggest challenge you have when it comes to writing?

When it comes to writing, my biggest challenge is how to explain a complex technical concept in simple language that everyone can understand. Since English is not my first language, this is a little bit harder for me. So, I try to use different examples to explain the technical concept. This way, readers can understand what I'm trying to deliver in the article.

What is the next thing you hope to achieve in your career?

I'm planning to continually improve my software development and machine learning skills, especially in Natural Language Processing. Since my journey in this field started in the tech community, I will continue to help others who want to pursue their career in machine learning by developing great technical content. My next big thing is to start creating e-books on Machine Learning and Data Science.

Wow, that’s admirable. Do you have a non-tech-related hobby? If yes, what is it?

Yes, I have a few non-tech-related hobbies. I love watching movies and football games, and reading books in different areas. I also love walking in the evening while listening to some good music. My new hobby is traveling.

What can the Hacker Noon community expect to read from you next?

Since most of my readers on HackerNoon like my in-depth articles, such as my PyCaret article, I'm planning to write more in-depth articles on machine learning and data science.

Thanks for taking time to join our “Meet the writer” series. It was a pleasure. Do you have any closing words?

I'm happy to be part of the HackerNoon community, especially the blogging fellowship program. My writing is improving every day, thanks to the wonderful editors at HackerNoon, who make sure that I post good-quality content. I will continue sharing my knowledge and experience to help others improve their skills in Data Science, Machine Learning and Python.


Written by davisdavid | Data Scientist | AI Practitioner | Software Developer | Technical Writer
Published by HackerNoon on 2021/10/04